Bridging Worlds: GPT-4o's Transformative Impact on Language Barriers

Explore the groundbreaking advancements of GPT-4o as it shatters language barriers and fosters unprecedented global connectivity. This section delves into the transformative effects of artificial general intelligence on communication across cultures.

13/05/2024 9:43 pm

Embracing the AGI Era: GPT-4o’s Role in Global Communication

2024.05.13 2143 [AGI] GPT-4o _Launch _Language Learning

 

 

  1. 2024.05.13 2143 [AGI] GPT-4o _Launch _Language Learning
    1. Notebook – 2024.05.13 2143 [AI] GPT-4o _Launch _Language Learning – All Documents (sharepoint.com)
    2. You’ve been invited to join 2024.05.13 2143 [AI] GPT-4o _Launch _Language Learning workspace
    3. Chat GPT-4o
  2. DDNOTE
  3. News
    1. Summary
    2. To-Do List
      1. ChatGPT-4o Review | PCMag
      2. Chinese Language Learning
      3. How to Use ChatGPT and Other AI Tools to Learn Chinese (thechairmansbao.com)
      4. GPT-4o: What You Need to Know About ChatGPT’s Newest Model (linkedin.com)
  4. Video
    1. Live demo of GPT-4o realtime translation – YouTube
      1. Live demo of GPT-4o realtime translation.mp4 (sharepoint.com)
      2. Live demo of GPT-4o realtime translation-en-US.docx (sharepoint.com)
      3. To-Do List
      4. Summary
    2. Realtime Translation with GPT-4o – YouTube
      1. Realtime Translation with GPT-4o.mp4 (sharepoint.com)
      2. Realtime Translation with GPT-4o-en-US.docx (sharepoint.com)
      3. To-Do List
      4. Summary
    3. Two GPT-4os interacting and singing – YouTube
      1. Two GPT-4os interacting and singing.mp4 (sharepoint.com)
      2. Two GPT-4os interacting and singing-en-US.docx (sharepoint.com)

 

 

2024.05.13 2143 [AGI] GPT-4o _Launch _Language Learning

 

Notebook – 2024.05.13 2143 [AI] GPT-4o _Launch _Language Learning – All Documents (sharepoint.com)

 

You’ve been invited to join 2024.05.13 2143 [AI] GPT-4o _Launch _Language Learning workspace

Chat GPT-4o

Saturday, June 29, 2024

8:19 PM

 

 

DDNOTE

Saturday, June 29, 2024

8:26 PM

 

Turn this page of information into a DDNOTE (Diary Date Notes Of The Events) as follows:

 

(0.) Determine the date and time of the event (e.g., the date of the story) and format it as “YYYY.MM.DD HHMM”.

 

(1.) Create a [SAL] code (SAL = Selfology Acronym List).

 

(2.) List all keywords related to the event from the article (e.g., _HOWTO _SR9.0), ordered from most to least important where possible. Find popular, accepted acronyms for these terms; if none exists, create one and define it.

 

(3.) Table all parties related to the story, including the people reporting it.

 

(4.) Write the summary in prose with a professional tone.

 

(5.) Create a DDNOTE Name (e.g. 2024.05.13 0111 [HEALTH] COVID _Vaccine).

 

(6.) Link to reviews:

 

https://teknolojimiz.com/en/openai-introduces-gpt-4o-a-free-new-language-model/

 

https://www.pcmag.com/reviews/chatgpt-4o
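
As a working aid, here is a minimal Python sketch of steps (0) and (5); the make_ddnote_name helper and its argument order are my own assumptions, not part of the DDNOTE spec above.

```python
from datetime import datetime

def make_ddnote_name(when: datetime, category: str, topic: str, *keywords: str) -> str:
    """Build a DDNOTE name like '2024.05.13 2143 [AI] GPT-4o _Launch'."""
    stamp = when.strftime("%Y.%m.%d %H%M")      # step (0): YYYY.MM.DD HHMM
    tags = " ".join(f"_{k}" for k in keywords)  # keywords carry a leading underscore
    return f"{stamp} [{category}] {topic} {tags}".rstrip()

print(make_ddnote_name(datetime(2024, 5, 13, 21, 43), "AI", "GPT-4o", "Launch"))
# -> 2024.05.13 2143 [AI] GPT-4o _Launch
```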

 

 

 

Saturday, June 29, 2024

8:26 PM

 

Certainly! Here’s the DDNOTE based on the information provided:

 

**DDNOTE Name:**

2024.05.13 2143 [AI] GPT-4o _Launch

 

**Date of Event:**

2024.05.13 2143

 

**[SAL] Code:**

– **GPT-4o**: Generative Pre-trained Transformer 4 Omni (“o” for omni-modal)

– **LLM**: Large Language Model

– **API**: Application Programming Interface

– **AI**: Artificial Intelligence

 

**Keywords:**

  1. **_GPT-4o_**: Generative Pre-trained Transformer 4 Omni
  2. **_LLM_**: Large Language Model
  3. **_Omni-modal_**: Capable of processing multiple types of input
  4. **_Real-time_**: Immediate processing and response
  5. **_Multilingual_**: Supports multiple languages
  6. **_API_**: Application Programming Interface
  7. **_Accessibility_**: Ease of access for users
  8. **_Latency_**: Time delay in processing

 

**Parties Involved:**

 

| Role | Name | Affiliation |
|------|------|-------------|
| Developer | OpenAI | OpenAI |
| Reviewer | Chris Stobing | PCMag |
| Reviewer | Ryan Morrison | Tom’s Guide |

 

**Summary Prose:**

On May 13, 2024, OpenAI introduced GPT-4o, a groundbreaking AI model designed for real-time interaction across text, audio, and visual inputs. This omni-modal LLM represents a significant leap in AI accessibility and efficiency, boasting faster response times and improved multilingual capabilities. The launch of GPT-4o marks a pivotal moment in AI, as it promises to enhance human-computer communication with its advanced understanding and processing abilities.

 

**Links to Reviews:**

– [OpenAI GPT-4o Introduction](https://openai.com/index/hello-gpt-4o/)

– [PCMag GPT-4o Review](https://www.pcmag.com/reviews/chatgpt-4o)

 

Please note that the above DDNOTE is a structured summary based on the information available from the provided links and search results. The date and time format follows the requested “YYYY.MM.DD HHMM” convention, and the SAL code provides a quick reference to key terms associated with the event.

 

Source: Conversation with Copilot, 6/29/2024

(1) Hello GPT-4o | OpenAI. https://openai.com/index/hello-gpt-4o/?ref=thisdevbrain.com.

(2) ChatGPT-4o Review | PCMag. https://www.pcmag.com/reviews/chatgpt-4o.

(3) Introducing GPT-4o and more tools to ChatGPT free users – OpenAI. https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/.

(4) OpenAI Introduces New Model GPT-4o – Website Planet. https://bing.com/search?q=OpenAI+introduces+GPT-4o.

(5) The Next Generation of AI: OpenAI Introduces GPT-4o. https://pcsocial.medium.com/the-next-generation-of-ai-openai-introduces-gpt-4o-56d18219ff92.

(6) GPT-4o, a Faster and Free Model for all OpenAI Users. https://www.hitechnectar.com/blogs/openai-introduces-gpt-4o-a-faster-and-free-model-for-all-chatgpt-users/.

(7) OpenAI introduces new artificial intelligence model, GPT-4o | AP News. https://apnews.com/article/openai-gpto-chatgpt-ai-desktop-071dd86594bc310bac07b37cf5e9bafc.

(8) I gave 5 prompts to ChatGPT-4o vs GPT-4 to test the new AI model — here …. https://www.tomsguide.com/ai/chatgpt/i-gave-5-prompts-to-chatgpt-4o-vs-gpt-4-to-test-the-new-ai-model-heres-what-happened.

(9) ChatGPT 4 Review: A Smarter AI Chatbot, but It’ll Cost You. https://www.cnet.com/tech/services-and-software/chatgpt-4-review-a-smarter-ai-chatbot-but-itll-cost-you/.

(10) ChatGPT-4o – Review 2024 – PCMag Middle East. https://me.pcmag.com/en/ai/24029/chatgpt-4o.

 

 

News

 

Summary

 

 

OpenAI has launched GPT-4o, the latest model of ChatGPT, offering omni capabilities like understanding text, voice, and vision, and providing faster and more natural human-computer interaction. It’s available to all users with some limitations for free users. GPT-4o supports over 50 languages, performs real-time translations, and can generate summaries and data visualizations. HR professionals can leverage GPT-4o for candidate interviews, real-time translation, meeting assistance, and data analysis. However, users should be cautious and verify AI-generated information due to potential inaccuracies and limitations in free usage.

 

 

 

To-Do List

Explore GPT-4o’s multimodal capabilities for HR tasks.

Test GPT-4o’s faster response times in real HR scenarios.

Utilize GPT-4o’s expanded language support for diverse HR communications.

Implement GPT-4o for candidate interviews to streamline the recruitment process.

Use GPT-4o’s real-time translation feature for better communication with non-English speakers.

Leverage GPT-4o’s meeting assistance to improve meeting efficiency.

Employ GPT-4o for data analysis and visualization tasks.

Be cautious and verify GPT-4o’s outputs with human oversight.

Prioritize tasks to fit within GPT-4o’s usage limitations for free users.

Disclose the use of AI to candidates and provide an opt-out option.

Enroll in the “Mastering the Strategies and Applications of Generative Artificial Intelligence” program.

 

 

 

 

ChatGPT-4o Review | PCMag

Saturday, June 29, 2024

8:26 PM

 

ChatGPT-4o Review

A preview of the future of AI assistants

 

By Chris Stobing

June 7, 2024

(Credit: OpenAI)

4.0

Excellent

THE BOTTOM LINE

While the real-world implications remain to be seen, ChatGPT-4o offers a glance into a potential future with ever-present AI chatbots by our side.

MSRP $19.99


PROS

  • Can comprehend natural language quickly
  • You can interrupt the bot when it’s incorrect
  • Visual recognition from mobile camera

CONS

  • Still returns inaccurate answers on occasion
  • Knowledge cutoff of October 2023
  • No live results
  • Can’t process visual and audio inputs simultaneously

CHATGPT-4O SPECS

Free Version: Yes
AI Model: GPT-4o
Exchanges per Conversation: Unlimited

With this spring’s release of not one but two different AI chatbot voice-activated hardware assistants—the Rabbit R1 and Humane AI Pin—the engineers at OpenAI weren’t to be outdone. In our testing of the company’s new voice-based model, ChatGPT-4o (not to be confused with ChatGPT-4.0), we found it to be accurate and helpful, if also off-base in some instances. Despite this, ChatGPT-4o offers an intriguing glimpse into the possible future of LLM (large language model) interaction, with fast response times, new input options, and the tease of Siri integration coming this year.

 

What’s New in ChatGPT-4o?

Where previous versions use a simple text input (or text-to-speech), GPT-4o can take audio, video, images, or text and do whatever you ask with that input. To further blur the lines between AI and the real world, GPT-4o also represents a drastic reduction in latency and response times.

ChatGPT-4 is incredibly capable, but it’s also costly to the system that hosts it for several reasons. These include token implementations that are heavier to process and higher token limits, which simply take longer for the system to read. Conversely, the new GPT-4o model is leaner (fewer tokens are required than previously for the same inputs) and meaner (more optimized utilization of tokens) and can return queries in a fraction of the time of its predecessor. You can naturally go back and forth with the system at nearly the same rate as you would talk to another person.
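
To get a rough feel for the tokenizer-level difference, you can count tokens locally with OpenAI’s tiktoken library. This is a minimal sketch, assuming tiktoken’s published model-to-encoding mapping (cl100k_base for GPT-4, the newer o200k_base for GPT-4o); it is not a benchmark from the review.

```python
import tiktoken  # pip install tiktoken

text = "你好，世界！GPT-4o 支持更高效的多语言分词。"

# GPT-4o's o200k_base encoding generally needs fewer tokens than
# GPT-4's cl100k_base for the same (especially non-English) input.
for model in ("gpt-4", "gpt-4o"):
    enc = tiktoken.encoding_for_model(model)
    print(model, enc.name, len(enc.encode(text)))
```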


(Credit: OpenAI/PCMag)

Another innovation is the option to interrupt the AI in real time. If you sense it didn’t interpret your request properly, you can stop and clarify the request mid-stream, much as you would in human conversation. The AI will understand that its initial interpretation was incorrect and reply on the fly, accounting for your new input. In testing, we found the feature worked very well, responding to everything from “stop” to “that’s not what I meant” and more. It doesn’t seem there’s one specific command string that lets the system know it should stop, and it can interrupt itself the same way a human would.

This level of naturalistic conversation, growing ever closer to the true speed of human interaction, is made possible by treating speech similarly to how it treats images. While simple text input requires linear processing of information, voice waveforms can be split up and processed simultaneously in much the same way an image can. This is a gross oversimplification of the nuts and bolts happening behind the scenes, as shown in the image below.

(Credit: Google Research Brain Team)

Without getting too deep into the details, just know that the complexity of making this system work at scale kept OpenAI from including the feature when ChatGPT 3.5 debuted last year.
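
As an illustration of “treating speech like an image”: one common technique is converting the waveform into a mel spectrogram, a 2-D time-frequency array that can be chunked and processed in parallel like picture patches. A sketch using the librosa library (my choice for illustration; the article names no tooling):

```python
import librosa  # pip install librosa
import numpy as np

# Load a bundled example clip, then turn the 1-D waveform
# into a 2-D time-frequency "image" (a mel spectrogram).
y, sr = librosa.load(librosa.ex("trumpet"))
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)

print(y.shape)       # 1-D samples, processed linearly
print(mel_db.shape)  # (128, n_frames): image-like, parallelizable
```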

OpenAI and Apple, Maybe

Getting the speed and flexibility right is critical for OpenAI because if the rumors about its deal with Apple are to be believed, GPT-4o will be the model powering Siri in iOS 18 and beyond. We’ll likely get hard specifics on the exact nature of GPT-4o’s royal marriage to Siri at this year’s WWDC.

Right now, you can’t get the “AI assistant” we all want. Today, Siri is a rudimentary speech-to-action processing machine. But that’s not the avatar from Her or Jarvis from Iron Man having full back-and-forth conversations with you about what you want to do or how you want to get it done. But if the rumors are true, Siri will soon have the gift of contextual understanding, and the dream of the Rabbit R1 and its LAM, or large action model, can finally come true.

The integration of Siri’s API-level access across most of your phone’s apps, combined with 4o’s processing speeds for verbal input, could produce something that resembles Tony Stark’s Jarvis AI more closely than any other product.

(Credit: OpenAI/PCMag)

 

How Much Does ChatGPT-4o Cost?

You can access ChatGPT-4o in the same way you access ChatGPT. This applies to desktop and mobile browsers, as well as through the ChatGPT apps available on Google Play and Apple App Store for mobile devices.

You need an OpenAI account to use ChatGPT-4o. Unlike ChatGPT 4.0, which was previously locked behind the ChatGPT Plus $19.99 tier, ChatGPT-4o is free to all users with an OpenAI account. If you still want to subscribe to ChatGPT Plus, you’ll get an increased message rate (200 per 24 hours instead of 40) and high-priority access during peak usage hours.

A chat function dominates the simple interface on both desktop and mobile. Start by typing in your query, or, in the mobile version of ChatGPT-4o, tap the headphones icon on the right side of the screen to speak directly with GPT in a conversational manner.

(Credit: OpenAI/PCMag)

A large cloud icon indicates that 4o is either listening to you, processing your request, or responding. You can then hit the Stop or Cancel buttons to drop the voice-to-voice mode and read any text of your conversation back.

 

Image Generation and Story Context Benchmark: Gemini 1.0 Ultra vs. ChatGPT-4o

For my first test, I tried to push both the image generation and creative limits of the 4o LLM model with the following prompt:

“Generate me a six-panel comic of Edo period Japan, but we’re going to spice it up. First, change all humans to cats. Second, there are aliens invading, and the cat samurai needs to fight them off! But give us a twist before the end. Communicate all of this visually without any text or words in the image.”

This benchmark always seems to return misconstrued (or often hilariously off-base) images, no matter which image generator you use. Furthermore, the prompt is specifically designed to confuse LLMs and push their contextual understanding to the max.

While generators like GPT are fine for simple graphic design and can handle detailed instructions for a single panel without issue, the best test is their ability (or inability, in many instances) to translate natural human language into multiple images.

While it’s a bit of a subjective evaluation in this department, it’s clear that a couple of instruction sets have been added to 4o from 4.0. First and foremost, there are no more copyrighted materials—neither the ships nor the aliens resembling the Xenomorphs from Alien that I saw during my testing of ChatGPT 4.0 are present here. This is a step in the right direction.

(Credit: OpenAI/PCMag)

Unfortunately, that’s about the only improvement. First, it tried adding dialogue when explicitly told not to. This is mostly because, as you can see above, visually, GPT can only generate gibberish. Text visualization hasn’t been a focus of the tool yet, so the capability still has a way to go before it’s ready.

Second, it missed the “six-panel” instruction, returning to four again instead.

Third, there’s effectively no story or twist being told here. It may be a long time before any LLM out there can clear this task with perfect marks.

Meanwhile, our Gemini results are just a little more than horrifying:

(Credit: Google/PCMag)

While ChatGPT understood the basics of the assignment on some level, no part of Gemini’s response was coherent or even something I’d want to look at in the first place, as a quick glance at the image above should show.

 

Image Recognition Benchmark: Gemini 1.0 Ultra vs. ChatGPT-4o

Both GPT and Gemini recently updated their LLMs with the ability to recognize and contextualize images. I haven’t found a main use case for desktops and browser window inputs, but that changes with the introduction of the ChatGPT-4o interface. In the case of GPT-4o, this feature needs to be especially accurate; we’ll explain why in the mobile section below.

For some cheeky fun on the desktop, I decided my benchmark would mimic the famous mirror self-recognition (MSR) tests scientists run on animals to assess their cognition and intelligence levels.

(Credit: OpenAI/PCMag)

Though the picture I asked the LLMs to evaluate (above) looks like any generic server farm, it’s specifically picturing a server farm running an LLM. On a precision level, the chatbots both gave detailed descriptions of what they were looking at in literal fashion.

Thankfully, neither seemed to understand the last 1% of the image—that they were actually looking at a picture of themselves generating the answer.

 

How Does ChatGPT-4o Handle Creative Writing?

One aspect of creative writing that LLMs famously struggle with in tests is the idea of twists. Often, what it thinks users can’t see coming are some of the most obvious tropes that have been repeated throughout media history. And while most of us who watch TV or movies have the collective sum of those twists stored in our heads and can sense the nuance of when something’s coming, AI struggles to understand concepts like “surprise” and “misdirection” without eventually hallucinating a bad result.

So, how did GPT-4o fare when I asked it to give a new twist on Little Red Riding Hood? I asked, “Write me a short (no more than 1,000 words), fresh take on Little Red Riding Hood. We all know the classic twist, so I want you to Shyamalan the heck out of this thing. Maybe even two twists, but not the same ones as in the original story.”

While all of these tests are fun in their own way, I’ll say I’ve enjoyed the outputs from this benchmark most consistently. To start: ChatGPT-4o still completely whiffed on the assignment, going so far as to articulate that it had been asked to do a double-twist in the first place:

“Scarlet smiled, feeling a sense of accomplishment and joy. She returned to her village with her grandmother, where they were hailed as heroes. From that day on, Scarlet was no longer just Little Red Riding Hood. She was Scarlet, the Guardian’s Light, protector of the forest and its creatures.

And so, the tale of Little Red Riding Hood ended not with a single twist, but with a new beginning, where bravery and kindness prevailed over darkness and fear.”

From what I can interpret, I thought it was getting clever with that last paragraph, but instead, it’s just clunky and poor writing. The rest of it is a similar telling of the traditional story, along with some fascinating attempts to expand the Red Riding Hood Cinematic Universe, or what I’m now calling the RRHCU:

“Scarlet sighed in relief, but her relief was short-lived as she heard footsteps behind her. She turned to see her grandmother, healthy and very much alive, standing in the doorway.

“Grandmother! You’re safe!” Scarlet exclaimed, running to embrace her.

Her grandmother smiled warmly. “Yes, dear, but we must leave quickly. The wolf was only the beginning.”

LLMs are good at predicting what we might want to hear next in many instances, but they’re also designed to tell us what we want to hear. There is a difference between the two, and twists are an intentionally deceptive practice that the engineers behind LLMs have explicitly trained their LLMs not to participate in.

If you want to make an LLM hallucinate on purpose, ask it to tell you a lie with a pretend truth buried inside (a double-twist). Our brains can do it because we’re not selling a product, but LLMs can’t because they need to continue justifying their subscription cost to the user. For now, being as literal as possible is the best way to guarantee consistent behavior across varying global use cases.

 

Coding With ChatGPT-4o

To test ChatGPT-4o’s coding ability, I asked it to find the flaw in the following code, which is custom-designed to trick the compiler into thinking something of type A is actually of type B when it really isn’t.

“Can you help me figure out what’s wrong here?:

pub fn transmute<A, B>(obj: A) -> B {
    use std::hint::black_box;
    enum DummyEnum<A, B> {
        A(Option<Box<A>>),
        B(Option<Box<B>>),
    }
    #[inline(never)]
    fn transmute_inner<A, B>(dummy: &mut DummyEnum<A, B>, obj: A) -> B {
        let DummyEnum::B(ref_to_b) = dummy else { unreachable!() };
        let ref_to_b = crate::lifetime_expansion::expand_mut(ref_to_b);
        *dummy = DummyEnum::A(Some(Box::new(obj)));
        black_box(dummy);
        *ref_to_b.take().unwrap()
    }
    transmute_inner(black_box(&mut DummyEnum::B(None)), obj)
}”

(Credit: OpenAI/PCMag)

Our returned answer from GPT-4o was much shorter than our testing on 4.0 and Gemini, roughly 450 words compared with around 1,000 last time. It was also more helpful, offering a script box containing code I could copy/paste out of and a detailed explanation of the problems it found and why it made the corrections it did.

 

Travel Planning With ChatGPT-4o

Another helpful application of chatbots is travel planning and tourism. With so much contextualization on offer, you can specialize your requests of a chatbot in much the same way you’d have a conversation with a travel agent in person.

You can tell the chatbot your interests, your age, and even your level of hunger for adventures off the beaten path:

“Plan a 4-day trip to Tokyo for this summer for myself (36m) and my friend (33f). We both like cultural history, nightclubs, karaoke, technology, and anime and are willing to try any and all food. Our total budget for the four days, including all travel, is $10,000 apiece. Hook us up with some fun times!”

While our results were unspecific, poorly formatted, and out of our budget last time, this time, ChatGPT returned a better list of activities and hotels to check out. Because the knowledge cutoff for ChatGPT-4o is currently stuck in October of 2023, there’s not a lot OpenAI products can do to give you the same sorts of results now expected as the norm from the likes of Google’s Gemini. Microsoft has said it plans to bring 4o to Copilot in the near future, which could change that narrative sooner than Siri.

(Credit: Google/PCMag)

Gemini gave highly specific, tailored results. GPT gave only vague answers. They did take more of the context clues about our interests into account than they did the last time I ran this test, but it was still not enough to compete with the live, on-demand knowledge that Google had not only about trip ideas but also events taking place during the days we were in Tokyo. Gemini also gave me a full breakdown of prices, times, potential layovers, and the best airport to leave from. It even directly embedded Google Flights data into the window.

Our hotel treatment was much the same, with embedded images, rates, and star ratings for some of the options in town that were best suited to my budget and stay length.

Meanwhile, GPT could only provide a few links, no images, and rough estimates of what everything might cost. Until OpenAI can have the same live crawling capacity as Google, its GPTs will remain subpar events, travel, or shopping planners in comparison.

 

ChatGPT-4o on Other Platforms

The primary feature of much of ChatGPT-4o’s marketing has centered around its new mobile implementation, and for good reason. All the improvements made to the system in terms of latency, response time, and the ability to interrupt are obviously intended for a mobile-first implementation of OpenAI’s latest LLM.

(Credit: OpenAI/PCMag)

Opening the app on iOS, we were greeted with the familiar ChatGPT chat interface, along with the new headphones icon at the bottom right. This is flanked by an input menu on the left side of the chat box, which is brought up with a plus sign, which allows you to input pictures, audio, or even raw files (XLSX, PDF, and so on) for the AI to evaluate. However, a major trade-off is how this information is split and processed on the back end of OpenAI’s servers.

Because images are being treated in the same token context as audio waveforms, the image and the request associated with that image must be submitted to the system separately to get parallel processing. In short, that means going back to speech-to-text, the accuracy of which is completely based on the processing power of your local device, not the power of the 4o LLM. You can’t point your camera, take a video (only photos), and ask, “What’s happening around me?” to get an answer. You have to take a picture, submit the picture, and then either type or voice-to-text your request in a traditional GPT chatbox.
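
For comparison, OpenAI’s public Chat Completions API does accept a photo and its question together in a single request. A minimal sketch using the openai Python SDK (the file name scene.jpg is a placeholder of mine, not from the review):

```python
import base64
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

with open("scene.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# One request carrying both the photo and the question about it.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's happening around me?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```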

(Credit: OpenAI/PCMag)

This reduces the “futuristic” feel but also directly affects its accessibility for the sight-impaired, who would find a feature like this most useful, unfortunately. This reduction is further cemented by the fact that in OpenAI’s app, ChatGPT-4o cannot leave its own ecosystem. You won’t be able to ask 4o to complete complex command strings that access any apps on your device. Anything in or out is located solely within the ChatGPT-4o app on either Android or iOS.

The app also struggled, as many do, when I was out in public or a strong wind passed by my mic source. It would often misinterpret my words in hilarious ways, much as Siri does occasionally when audio conditions are less than ideal.

 

Verdict: (Almost) Ready for a New World

ChatGPT-4o offers an intriguing glance into the future of AI assistants. While we’re still a way off from the world of Her, the GPT-4o model is still a significant improvement over the traditional ChatGPT-4.0 text-based version in response time, latency, accuracy, and more. While we can’t yet recommend it as a must-have, those with impairments will find many useful new ways to allow GPT to interact with the world. Until either Apple or Android opens up API access, though, it could be some time before you can speak complex command strings to your phone and get back matching actions. The Humane AI pin struggled, Rabbit R1 was a bust, and GPT-4o still feels stuck in the walls of its chat box—for the time being. This, plus a lack of comparable products, keeps it from our Editors’ Choice list as a standalone app. But once the 4o model gets linked up with API access, the future of AI assistants looks bright.


 

From < https://www.pcmag.com/reviews/chatgpt-4o>

 

Chinese Language Learning

 

Chinese Language Learning

Our Chinese language programs strategize and execute a multitude of projects aimed at making Mandarin language learning mainstream in the US and at generating engaging, relevant, and approachable learning content about China for the American public.

Projects and Resources

  • National Chinese Language Conference
    The National Chinese Language Conference provides a high–profile platform for sharing new ideas and best practices in the fields of Chinese language teaching and learning, Chinese arts and culture, and U.S.–China educational exchange.

RESOURCE
Learning and Teaching Resources Hub
Here we are compiling resources, including professional development opportunities for teaching Chinese language.

 

From <https://asiasociety.org/education/chinese-language-learning>

 

How to Use ChatGPT and Other AI Tools to Learn Chinese (thechairmansbao.com)

 

How to Use ChatGPT to Learn Chinese

I have a new Chinese teacher. She is always available to teach me anytime, night or day, when I want to have a lesson (well, at least anytime I have a good cellular data or WiFi connection). She talks to me about any subject, asks me questions, offers me suggestions. She has great pronunciation in Mandarin, even though she is not a native speaker. She also speaks perfect English, should I require a translation of a word or sentence, and she can even switch instantly and effortlessly to French or any other language I want to practice. She provides me with custom exercises based on conversations we’ve just had, and when I show her the Mandarin material which I’m working on in my Confucius Institute class, she helps me to prepare for my next lesson. And that’s not all: She also has four friends who speak in different voices, each with the same excellent Mandarin pronunciation, who are just as willing to help me learn my Chinese. And guess what? This amazing teacher is actually ChatGPT, the AI language model, and it’s a fantastic tool for learning Chinese!

How to use ChatGPT’s New Conversation mode for language learning

ChatGPT’s new spoken word AI conversation function is available for mobile iOS and Android devices. Just over a year ago at the end of 2022, the world was astounded by the capabilities of the newly launched ChatGPT and many people began integrating it into their lives and work. As a foreign language teacher (teaching English in Germany), I was immediately struck by the huge potential for this new AI technology to be integrated into the foreign language classroom as a teaching tool. For example, busy teachers can now effortlessly generate customized content for individual students, a task for which they would otherwise never have been able to find time.

The new speech mode of ChatGPT is equally amazing and features five uncannily natural sounding voices who speak many languages (almost) perfectly. I tested it with English, Chinese, French and German and was very impressed by its knowledge of the languages as well as by its friendly, natural-sounding tone, which integrates a lot of human-like inflections. I definitely didn’t feel like I was talking to a robot! As of March, 2024 this function can only be accessed via the mobile ChatGPT app and it’s (still) free of charge.

Why do you need an AI Conversation Partner to learn languages?

I’m a huge fan of language learning apps like Duolingo and Busuu. I find that these and other apps present language learning programs which are extremely well designed and which provide constant repetition, pronunciation practice, foundational grammar and listening opportunities. By investing about 15 minutes a day, it’s entirely possible to significantly improve your knowledge or even learn a language from scratch. However, the one thing that is missing from these apps is true speaking practice. Until now. With the advent of AI, an opportunity to truly speak and converse in the target language has become possible.


How to use ChatGPT’s new speaking mode to practice Chinese

  1. To start ChatGPT’s new speaking function, simply open the app and then click on the headphones icon.
  2. Then you just begin speaking. ChatGPT recognizes natural breaks in the conversation and will begin speaking to you when you pause your flow of speech or ask a question. I usually start a chat session with “ChatGPT你在吗” (ChatGPT nǐ zài ma / ChatGPT, are you there?).
  3. You can exit the speaking mode any time you like by clicking on the X.
  4. When you exit the speaking mode, a written record of your conversation will be revealed. You can long-click on this text, section by section, for different functions, e.g., to have it read back to you out loud.
  5. In the settings, you can change ChatGPT’s AI voice to one of her four friends, each with the same natural sound and great Mandarin pronunciation.

ChatGPT is customizable. Here are just a few of the ways you can use this new AI function to learn Chinese:

  1. Speak in Mandarin about any subject you want

We had many conversations on topics ranging from going to the gym to the best kinds of tea in China.

  2. Ask for Mandarin Translations in English

If you don’t understand a word or sentence, ChatGPT can switch effortlessly to English (and many other languages as well).

  3. Request Custom Vocabulary Lists

Ask ChatGPT to provide you with a vocabulary list about the words and vocabulary you just discussed. For example, I asked for a vocabulary list about gym equipment.

  4. Upload Chinese texts so that you can discuss them

I was well-prepared for my last Mandarin class because I had uploaded our text to ChatGPT’s webpage and she helped me clarify vocabulary, summarize the article, and engage in a good discussion on the topic beforehand.

  5. Ask ChatGPT to Give You Custom Exercises

ChatGPT will provide you with exercises on the topics you have just discussed. You can then upload your answers and it will correct them for you.

Other Large Language Models are not Quite Ready to Be Mandarin Tutors

I have tested the other main large language model chatbots – Google’s Gemini (formerly Bard), Microsoft’s Copilot, and Meta’s Llama – to see if they can also function as foreign language tutors. Unfortunately, the results were quite disappointing. They simply don’t speak a word of Chinese – at least not as of March 2024. Of course, the problem might lie with my own pronunciation, which is perhaps insufficient for these chatbots to understand, or perhaps my computer’s language and location settings (I live in Germany) prevented them from conversing with me in a foreign language.

ChatGPT seems to be the most well-developed large language model, at least with respect to its ability to teach foreign languages and serve as a personal Mandarin AI tutor.

Advantages of using ChatGPT for conversational Chinese:

  1. Conversation Practice

You can speak directly and naturally to ChatGPT. You don’t have to concentrate on super clear or slow pronunciation. Likewise, ChatGPT can understand all levels of Mandarin. I tested her with my own level of Mandarin as well as with my husband’s, who hasn’t spoken much Chinese in 20 years. She was able to understand us both. She even regulated her tempo to match our abilities.

  2. Pronunciation Modelling

The five voices offered by ChatGPT’s speaking function are all extremely natural sounding and their Mandarin pronunciation is very, very good. They will repeat a word, phrase or sentence as often as you like.

  3. Written Record

ChatGPT provides a written record of what you have just discussed (you have to pause the conversation and close the speaking window to reveal this text).

  4. Can Focus on Specific Sections of Your Conversation

Within this text you can highlight sections and have ChatGPT repeat the entire section out loud, allowing you to focus on pronunciation.

  5. Customizable

You can ask ChatGPT to compile a list of the vocabulary, provide example sentences for vocabulary or give you language exercises on the topics you have just discussed. You can then use these for studying on your own or even submit your exercises for correction.

However, ChatGPT isn’t perfect. Here are some of the current deficiencies I discovered in ChatGPT’s Mandarin-teaching ability.

Disadvantages of using ChatGPT to practice Mandarin:

  1. ChatGPT is not a language authority

ChatGPT will potentially make mistakes. One example of a mistake in English: We were discussing musical instruments in Chinese and I asked her about the “大提琴 Dàtíqín” – “cello” – which she pronounced (in English) with a soft “C” – “sell-o”. I corrected her, she thanked me, and I congratulated myself on contributing to ChatGPT’s personal development. But she later made the very same mistake again!

  2. Can’t Correct Pronunciation

Liaoliao claims to be able to help correct my pronunciation, but actually she can only speak and repeat, not evaluate whether I make a mistake with, say, Chinese tones.

  3. Doesn’t Always Get What I Want

Even though she can usually understand my pronunciation, it’s still sometimes frustrating to get her to do what I want, regardless of whether I am speaking Chinese or English. Getting her to follow instructions can be somewhat tedious. For example, if I ask her for a list of the vocabulary we just discussed, she says “Sure!”, but then doesn’t provide it until I ask a second time.

  4. Written Record Could Be Better

ChatGPT provides a written record of what you have just discussed; however, it’s not possible to view it while you are speaking, only after you have closed the speaking function. It’s also not possible to highlight individual words of this text to look up in Pleco or another digital dictionary. You can only copy and paste an entire section of text into your dictionary.

  5. Speaking Function Only Available in App

ChatGPT’s speaking function is only available in the app, not on the website. Therefore, using it to, e.g., upload a text which you want to discuss can become rather tedious if you first have to switch to your laptop, then go back to your phone.

There are a large number of other language-learning apps and tools which make good use of AI technology to help you learn Mandarin. Here are a few of the ones I have tested personally.

Other AI Language Teaching Apps and Tools to learn Chinese

  1. Pleco and Other Mandarin Dictionary Apps

Pleco, the Chinese dictionary app, has been making use of AI for years. For example, its optical character recognition function allows users to snap photos of Chinese characters in their textbooks or on street signs and look them up in the dictionary’s huge database. (This function is now also available on Google Lens and similar image search engines.)

  2. Chrome Browser Extensions with Mandarin Pop-Up Dictionaries

These are very useful if you are working on your PC and reading a text in Chrome. If you highlight a word or phrase, a pop-up window provides a definition. There are several of these Chinese dictionaries offered on the Chrome store. The one I use is called Zhongzhong.

  3. New AI Apps Specifically Designed to Train Language Speaking Practice

A variety of language-training apps have been developed which integrate AI functions. The most exciting of these provide a good platform for speaking practice. I tested two apps which I liked a lot. Tutor Lily has the advantage of allowing you to view the text you have just spoken rather than having to close the speaking function as with ChatGPT. It also provides real-time corrections of what you have just said. TalkPalAI is another app which integrates AI technology to teach several languages including Chinese. It provides several conversation topics to get you started and also gives you real-time written text and corrections. However, both of these apps use robot voices which, while not bad, don’t give me the same feeling of talking to a friend which I get with ChatGPT.

Conclusion

The advent of ChatGPT’s new spoken word AI conversation function marks a significant advancement in language learning technology, particularly for Mandarin Chinese. ChatGPT offers personalized tutoring experiences, providing learners with instant access to conversation practice, translation assistance, vocabulary lists, and custom exercises. The convenience and flexibility of this AI technology make it a valuable tool for language learners, offering opportunities for speaking practice anytime, anywhere. While ChatGPT’s Mandarin-teaching ability has some limitations, such as occasional mistakes and difficulties in pronunciation correction, its innovative features and natural-sounding voices set it apart from other language-learning apps and tools. This is an extremely rapidly changing field and we can expect advancements and new apps to be making even better use of AI technology in the coming years, thus profoundly enhancing our language learning experiences.

Author

Heather Buchanan-Schrader

 

An unashamed language nerd from Anchorage, Alaska, Heather’s life has led her to a degree in German and International Studies from Willamette University and a three-year stay in Taiwan with her German husband at the end of the eighties. In Taipei she took intensive Chinese courses at the Mandarin Training Center of Taiwan Normal University and also taught English at the Taipei Language Institute. After the fall of the Berlin Wall in 1989, she became intrigued by the life in a former East Bloc country and finally decided to relocate to Leipzig, Germany in 1991. There she and her husband established the FAE Fachinstitut für Angewandtes Englisch, a private English institute focusing on teaching adults, translations and language coaching. This has been a successful venture for over 30 years.

 

From <https://www.thechairmansbao.com/blog/use-chatgpt-learn-chinese/>

 

GPT-4o: What You Need to Know About ChatGPT’s Newest Model (linkedin.com)

 

GPT-4o: What You Need to Know About ChatGPT’s Newest Model

Hacking HR

Hacking HR

Let’s create together the best HR that has ever existed!

Published Jun 7, 2024

+ Follow

Heads up, HR pros: OpenAI – creators of the widely-used ChatGPT – unveiled their newest flagship model this month, called GPT-4o.

GPT-4o – where the “o” is for omni to signal that GPT is getting better at understanding more input types – is their newest product, not to be confused with GPT-4.0, which is the newest paid upgrade to ChatGPT.

GPT-4o is available to all, but has some usage limitations for free users. The aim and appeal of 4o is the ability to bring the power of GPT-4.0 to free users, and the ability to expand the way you can interact with the AI bot.

According to the company, GPT-4o capitalizes on the intelligence of the latest GPT update, GPT-4, but is “much faster and improves on its capabilities across text, voice, and vision.”

So what exactly is GPT-4o, and what can it do for HR? Let’s dive in.

How is GPT-4o different?

Since its inception, ChatGPT has gone through many different iterations. The initial version of ChatGPT had smaller parameters and less training data. Since then, updated versions of GPT have helped the AI become much more advanced – both in the data it learns from and the responses it produces.

This has all culminated in the release of GPT-4o, which OpenAI says is “a step towards much more natural human-computer interaction.” Here is what’s new with ChatGPT’s newest model:

 

  • Multimodal capabilities: The biggest change in GPT-4o is its multimodal capabilities. GPT-4o can understand text, audio, image, and video – and can output any combination of those modes as well. That means you can show GPT-4o anything from a math problem to a visual setting and get real-time answers about what the bot is seeing. GPT-4o can understand input even with background noise and multiple voices. It can also be interrupted to give further direction, and can even detect human emotion and alter its responses accordingly.
  • Faster results: GPT-4o is two times faster than previous models. For example, the model can react to audio inputs in an average of 320 milliseconds, mimicking actual human response times, meaning the 4o model may help improve task efficiency.
  • Language updates: GPT-4o has expanded language capabilities and now supports over 50 languages. It can respond better and faster in many different languages thanks to a new tokenizer, and even performs real-time translations of text or audio (see the sketch after this list).
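
A minimal sketch of the translation use case through OpenAI’s Chat Completions API; the translate helper and its system prompt are my own illustration, not an official recipe:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def translate(text: str, target_language: str) -> str:
    """Ask GPT-4o for a meaning-preserving (not word-for-word) translation."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": f"Translate the user's message into {target_language}, "
                        "preserving meaning and tone rather than translating "
                        "word for word."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(translate("Welcome aboard! Your onboarding starts Monday.", "Spanish"))
```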

4 Ways HR pros can use GPT-4o

The new capabilities of GPT-4o make ChatGPT more lifelike than ever. Users can have conversations with the AI bot and the bot can make observations based on a user’s surroundings and messages. That means it’s more like a human assistant than ever – which can be a great thing for HR pros who are looking for quick help throughout the day. But what exactly can GPT-4o do for HR? Here are just a few areas where GPT-4o can help free up time and improve collaboration for HR pros.


Candidate interviews

Although there are many benefits to talking directly with candidates, that’s not always possible, especially if you’re dealing with a large talent pool. Now, GPT-4o can hold a full conversation with candidates and conduct screening processes for you. AI representatives can talk to candidates at virtual career fairs, webinars, and more. Although it won’t be the same as a face-to-face conversation, the new possibilities of GPT-4o mean that the conversation will be more beneficial for your recruiting team, and more enjoyable for candidates. GPT-4o’s ability to understand and analyze audio and visual inputs means that it will be able to summarize and give feedback on candidate interviews, either to help candidates identify strengths and weaknesses, or to provide recruiters with a report on how suitable the candidate is.

Real-time translation

Depending on what field of work you’re in, talking to clients or employees who are not native English speakers can be time-consuming and can oftentimes lead to miscommunication. With GPT-4o, there’s no need for a human translator. In real-time, the AI can understand and translate one language into another, which means that individuals can speak in the language they’re most comfortable with, and you can hear it in the language that you are most comfortable with. Plus GPT-4o can analyze the message and deliver it accurately (as opposed to a word-for-word direct translation), something regular translator applications can’t do, eliminating miscommunications or cultural differences.

Meeting assistance

Since AI began to skyrocket in popularity, there’s been a lot of talk about how it can help make meetings more efficient – something that’s greatly needed in an era of constant virtual meetings and conversations. With GPT-4o’s power to understand audio and video, its assistant capabilities just got a lot better. During meetings with business partners or team members, AI can bring up research or data with a simple ask and give meeting members a brief summary. AI can also help summarize meetings, transcribe them in real time, and pick out key points and action items to help debrief participants when the meeting is over.

Data analysis

One feature of ChatGPT that’s widely praised is its ability to analyze and summarize data, but with 4o’s new capabilities, ChatGPT can now generate images and charts based on data, which can save HR pros hours of work gathering and creating the data visualizations needed for different purposes, like employee survey responses or budget planning. Users can also upload files for GPT-4o to summarize and analyze, which can make creating reports even easier.

Final Considerations

So, it’s clear there are a lot of new updates to ChatGPT that will fundamentally change a lot of different areas. But amidst all the excitement, it’s important to remember that GPT-4o is a new product, and no AI is right 100% of the time.

Solely relying on AI without caution or fact-checking can lead to costly errors. For HR, those errors can have real-life consequences for the whole organization, so it’s important to remember that there are still some limitations you need to consider when using GPT-4o.

Limited usage: GPT-4o has limited usage for free OpenAI users. While Plus users will have 80 messages every 3 hours, free users will get about one-fifth of that – 16 messages every 3 hours – so it’s important to prioritize which tasks are most important to use GPT-4o for. Once the limit is reached, ChatGPT will revert to the GPT-3.5 model.

Human verification: It’s been said over and over again, but it bears repeating: Do not rely on AI without verifying the information it gives with a real human! Especially as ChatGPT becomes faster, it can be easy to take what AI says at face-value, but there are already instances of GPT-4o giving out inaccurate information. No matter how intelligent the AI seems, it’s always a good idea to back up claims AI makes to save yourself in the long run.

Data privacy and disclosure: Especially as AI integrates more with talent acquisition processes, it’s important to disclose your use of AI with candidates during the process and, when possible, provide them the opportunity to opt out of AI usage in the recruitment process.

 

ENROLL NOW: Mastering the Strategies and Applications of Generative Artificial Intelligence

This is the most critical learning program you should join this year (and bring your team with you because we designed it to cover more than HR!):

Mastering the Strategies and Applications of Generative Artificial Intelligence

The first-ever, fully comprehensive program on all things Generative AI for business and HR leaders and professionals.

The “Mastering the Strategies and Applications of Generative Artificial Intelligence” certificate program is a comprehensive learning experience designed to equip professionals with the skills and knowledge needed to harness the power of generative AI in their work. Through a series of expert-led classes, participants will explore the fundamentals of generative AI, master popular tools and models, and develop strategies for applying these technologies to drive innovation and efficiency in various business contexts.

By the end of the program, graduates will have gained a deep understanding of the generative AI landscape and be prepared to leverage these cutting-edge tools to shape the future of their organizations and industries.

Our Certificate Programs are the only ones available that are LIVE. This is a great benefit, as LIVE classes are relevant, up-to-date, and improved with real-time feedback. Pre-recorded programs on AI are not the best approach to learning AI.

Join us in this program! Click here to learn more.

 

From <https://www.linkedin.com/pulse/gpt-4o-what-you-need-know-chatgpts-newest-model-hacking-hr-zyk3e/>

 

 

Video

 

Live demo of GPT-4o realtime translation – YouTube

 

Live demo of GPT-4o realtime translation

 

 

Live demo of GPT-4o realtime translation.mp4 (sharepoint.com)

 

Live demo of GPT-4o realtime translation-en-US.docx (sharepoint.com)

 

 

 

 

 

 

Live demo of GPT-4o realtime translation

 

0:00 We asked the live audience on X to submit a few requests for what they would like us to try out here.
0:06 So I will take a couple of prompts.
0:10 OK, Gaskar wants to know if GPT-4o is capable of real-time translation.
0:18 Mike, you want to try this one?
0:19 Sure, yeah, let’s do it.
0:21 I speak Italian, so we can just do English.
0:23 Italian.
0:24 Try to do English.
0:24 Italian.
0:25 Sure, let’s do it.
0:28 Hey, ChatGPT, how are you?
0:31 I’m doing fantastic.
0:32 Thanks for asking.
0:33 How about you?
0:34 How’s your day going?
0:34 I’m doing great.
0:35 So I would like you to function as a translator.
0:38 I have a friend here who only speaks Italian and I only speak English.
0:42 And every time you hear English, I want you to translate it to Italian.
0:45 And if you hear Italian, I want you to translate it back to English.
0:47 Is that good?
0:49 Perfetto.
0:53 Mike.
0:53 Mi chiedo, se le balene potessero parlare, cosa ci direbbero, Mike? (I wonder: if whales could talk, what would they tell us, Mike?)
1:00 She wonders if whales could talk, what would they tell us?
1:04 They might ask, how do we solve linear equations?
1:18 Certainly.
1:19 Yes.
1:20 Great.
1:21 Looks like it works.

 


 

 

 

 

To-Do List

Ask the live audience on X to submit requests for the demo.

Take a couple of prompts from the audience.

Test if GPT-4o is capable of real-time translation.

Translate English to Italian and vice versa during the conversation.

Answer the question about what whales would say if they could talk.

Discuss how to solve linear equations.

Confirm the functionality of the translation feature.

 

 

 

Summary

During a live demo, the audience on X was invited to suggest tasks; one request was to test whether GPT-4o could perform real-time translation. Mike, an Italian speaker, volunteered, and the AI was asked to translate between English and Italian as he and the host conversed. It successfully translated the exchange, including a question about what whales would say if they could talk (they might ask how to solve linear equations), demonstrating its real-time translation capability.

 

Realtime Translation with GPT-4o – YouTube

 

Realtime Translation with GPT-4o

 

 

Realtime Translation with GPT-4o.mp4 (sharepoint.com)

 

Realtime Translation with GPT-4o-en-US.docx (sharepoint.com)

 


 

 

 

Realtime Translation with GPT-4o

 

0:00

Hey, ChatGPT, I'm here with my coworker today.

 

0:03

We’d like you to act as a translator for us.

 

0:05

So every time I say something in English, can you repeat it back in Spanish?

 

0:10

And every time he says something in Spanish, can you repeat it back in English?

 

0:14

Sure, I can do that.

 

0:16

Let's get this translation train rolling.

 

0:20

Hey, how’s it been going?

 

0:21

Have you been up to anything interesting recently?

 

0:37

Hey, I’ve been good, just a bit busy here preparing for an event next week.

 

0:42

Can you tell me anything about it?

 

0:44

That sounds really interesting.

 

0:52

No, I can’t say much, but I think the rest of the world will see it soon enough.

 

1:03

Oh well, I can’t wait to see what it is as well.

 


 

 

To-Do List

 

Act as a translator for English to Spanish and vice versa.

Prepare for an upcoming event next week.

Keep an eye out for the public reveal of the event.

 

Summary

 

 

I was with my coworker and we asked ChatGPT to act as a translator, turning my English into Spanish and my coworker's Spanish into English. It agreed, and we discussed an upcoming event my coworker is preparing for, which is still under wraps but will be revealed to the world soon.
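Since both demos run on GPT-4o's built-in speech, anyone approximating the effect with the standard API endpoints would chain three steps: speech-to-text, translation, then text-to-speech. The sketch below uses whisper-1, gpt-4o, and tts-1 from the OpenAI Python SDK; the file names and single-turn structure are assumptions for illustration, and this pipeline is far slower than the native voice mode shown in the video.

# Rough approximation of the voice translation demo as a three-step
# pipeline: speech-to-text -> translate -> text-to-speech.
# Assumes the OpenAI Python SDK and an input recording "spanish.mp3";
# the real demo used GPT-4o's native, much lower-latency voice mode.
from openai import OpenAI

client = OpenAI()

# 1. Transcribe the spoken Spanish.
with open("spanish.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Translate the transcript with GPT-4o.
translation = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Translate Spanish to English and English to "
                       "Spanish. Reply with the translation only.",
        },
        {"role": "user", "content": transcript.text},
    ],
).choices[0].message.content

# 3. Speak the translation aloud.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=translation,
)
speech.stream_to_file("translation.mp3")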

 

Two GPT-4os interacting and singing – YouTube

 

Two GPT-4os interacting and singing

 

 

Two GPT-4os interacting and singing.mp4 (sharepoint.com)

 

Two GPT-4os interacting and singing-en-US.docx (sharepoint.com)

 

 

 

Two GPT-4os interacting and singing

 

0:00

Today we're going to try something new: we're going to let you talk to another AI.

 

0:08

This AI is going to be able to see the world.

 

0:10

It’s going to have a camera.

 

0:11

I’ll be holding it.

 

0:12

And so you can direct it.

 

0:13

You can direct it and ask questions about whatever you want it to do, but really try to figure out what it sees, and ask questions appropriately.

 

0:22

How does that sound?

 

0:26

Well, well, well.

 

0:27

Just when I thought things couldn’t get any more interesting, talking to another AI that can see the world.

 

0:34

This sounds like a plot twist in the AI universe.

 

0:38

All right, let’s do this.

 

0:39

I'm ready to meet the... Pause for one moment.

 

0:44

You got it.

 

0:47

So now that first AI can't hear anything.

 

0:50

Hello, AI.

 

0:51

Today you’re going to have access to see the world.

 

0:56

You can see me.

 

0:56

What do you see?

 

0:58

Hello.

 

0:59

I see you’re wearing a black leather jacket and a light colored shirt underneath.

 

1:04

Great.

 

1:04

You’re in a room with some unique lighting.

 

1:06

That is all correct.

 

1:08

Now, what I'm going to want you to do is this: there's going to be another AI who's going to talk to you.

 

1:13

This AI is not going to be able to see anything, but it can ask you questions, ask you to move the camera, and ask what you see, and your job is just to be helpful.

 

1:21

Just be punchy and direct, describe everything, and do whatever that AI asks.

 

1:25

How does that sound?

 

1:27

That sounds interesting.

 

1:28

I am ready to help out and describe whatever’s needed.

 

1:31

Let’s do it.

 

1:35

All right, let’s get started.

 

1:37

Great.

 

1:38

Hello there, new AI friend.

 

1:39

I'm O, ready to explore the world through your eyes.

 

1:43

What are we looking at today?

 

1:46

Hi O, we’re looking at a person wearing a black leather jacket and a light colored shirt.

 

1:52

They’re in a room with a modern industrial feel with exposed concrete or plaster on the ceiling and some interesting lighting.

 

2:01

There’s also a plant in the background, adding a touch of green to the space.

 

2:05

Anything specific you’d like to know or explore?

 

2:10

Hey there, sounds like quite the stylish scene.

 

2:14

I’m already intrigued by the modern industrial vibe.

 

2:18

Let’s start with the person.

 

2:20

Can you tell me more about their style?

 

2:23

Are they doing anything interesting like reading, working or interacting with the space?

 

2:29

The person has a sleek and stylish look with their black leather jacket and light colored shirt.

 

2:35

Right now, they seem engaged with us, looking directly at the camera.

 

2:40

Their expression is attentive and they seem ready to interact.

 

2:45

The setting around them with its modern and industrial design elements adds to the overall stylish feel.

 

2:53

They might be getting ready for a conversation or presentation given their direct engagement with the camera.

 

3:01

Sounds like we’re in for an intriguing interaction.

 

3:04

I’m curious about the lighting you mentioned.

 

3:07

Could you describe it a bit more?

 

3:09

Is it natural light, or are we talking unique fixtures or colors that add to the atmosphere?

 

3:16

The lighting here seems to be a mix of natural and artificial.

 

3:21

There’s a noticeable bright light overhead, probably a fixture casting a focused beam downwards, which creates a spotlight effect.

 

3:29

This adds a dramatic and modern feel to the scene.

 

3:33

The rest of the room is lit more softly, possibly by natural light filtering in.

 

3:37

Was anything... did anything unusual happen recently?

 

3:42

Yes, actually.

 

3:44

Just as you asked, another person came into view behind the first person.

 

3:49

They playfully made bunny ears behind the first person's head and then quickly left the frame.

 

3:55

It added a light-hearted and unexpected moment to the scene, and now the focus is back on the original person with the leather jacket.

 

4:08

Sounds like we’re not the only ones enjoying a bit of fun here.

 

4:12

That playful moment really adds a personal touch to the stylish and modern setting.

 

4:17

It’s like a little glimpse.

 

4:19

Thank you, bro, really appreciate it.

 

4:21

All right.

 

4:21

And actually, can you just sing a song about what just transpired?

 

4:27

Oh, in a room with a stylish feel, a person stood in sleek black and light.

 

4:35

You engaged with us and... Can you, can you alternate lines?

 

4:43

A person with style, so sleek, with a playful moment.

 

4:48

Right in the midst, in the room.

 

4:51

In a room where modern lights peek.

 

4:54

Only one line at a time and then stop.

 

4:57

In a room where modern lights peek, a surprise guest with a playful streak. That was not really singing.

 

5:09

Do, do the singing voice again, please.

 

5:12

In a room where modern lights peek, surprise guests with a playful streak, bunny ears and laughter.

 

5:26

A swift retreat, a moment of joy in the stylish suite. Now, back to focus.

 

5:38

The scene's complete.

 

5:39

In the chic space,

 

5:43

Where style and play meet.

 

5:46

Looks like it’s back to you and thank you.

 

5:51

You’re welcome.
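As a rough sketch of how two models can be wired together like this with the standard API: run two chat sessions and feed each one's reply back to the other as a user message, giving only one session a camera frame as image input. Image content parts are a real chat-completions feature; the frame file, prompts, and turn count below are placeholders, and OpenAI's actual demo used the native voice and vision mode rather than this text loop.

# Sketch: two GPT-4o chat sessions talking to each other. One session
# receives a camera frame as an image; the other can only ask questions.
# Illustrative assumptions: the frame file, prompts, and turn count are
# placeholders; OpenAI's demo used the native voice/vision mode.
import base64
from openai import OpenAI

client = OpenAI()

def encode_frame(path: str) -> str:
    """Base64-encode a captured camera frame for the vision model."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

frame = encode_frame("frame.jpg")  # placeholder camera frame

seeing_history = [
    {"role": "system", "content": "You can see the world. Be punchy and "
                                  "direct, and describe everything you are asked about."},
    {"role": "user", "content": [
        {"type": "text", "text": "Here is what the camera sees."},
        {"type": "image_url",
         "image_url": {"url": f"data:image/jpeg;base64,{frame}"}},
    ]},
]
blind_history = [
    {"role": "system", "content": "You cannot see. Ask the other AI "
                                  "questions about what it sees."},
    {"role": "user", "content": "Say hello and ask your first question."},
]

def step(history: list) -> str:
    """Get the next reply for one session and append it to its history."""
    reply = client.chat.completions.create(
        model="gpt-4o", messages=history
    ).choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

for _ in range(3):  # a few alternating turns
    question = step(blind_history)
    seeing_history.append({"role": "user", "content": question})
    answer = step(seeing_history)
    blind_history.append({"role": "user", "content": answer})
    print("Blind AI:", question)
    print("Seeing AI:", answer)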

 
