Google has updated its AI chatbot with a new name and interface, and OpenAI’s virtual assistant has also seen improvements, so it’s a good time to compare how the two stack up against each other.
Chatbots have become integral to the landscape of generative AI, serving various functions from being search engines to creative aids. Both ChatGPT and Google Gemini possess image creation abilities and can integrate with other services.
In this comparison, I’ll be evaluating the free versions of ChatGPT and Google Gemini, namely GPT-3.5 and Gemini 1.0 Pro respectively. Image generation capabilities are excluded from this review, as they are not available in the free versions. Criticisms regarding Gemini’s handling of race are also not considered in this comparison.
To ensure fairness, features exclusive to one chatbot are excluded from the evaluation. The focus is on testing their responses to different queries, coding proficiency, and creativity.
- Coding proficiency:
- This test evaluates the ability of both chatbots to write a Python script for a personal expense tracker. Gemini demonstrates stronger coding proficiency by providing additional functionalities and more detailed reporting options compared to ChatGPT. The clarity, efficiency, and extensibility of Gemini’s script contribute to its superiority in this category.
- Natural language understanding (NLU):
- Both chatbots are assessed on their comprehension of natural language prompts. Their performance is measured based on how accurately they interpret the given question and provide relevant responses. Both ChatGPT and Gemini exhibit robust natural language understanding capabilities, accurately grasping the context and delivering appropriate answers.
- Creative text generation & adaptability:
- In this test, the chatbots are tasked with crafting a short story set in a futuristic city. The evaluation focuses on their ability to generate original, engaging narratives while adhering to specific themes and instructions. Gemini’s response stands out for its adherence to the provided rubric, incorporating creative elements and maintaining a consistent narrative style, thus showcasing its adaptability and creativity.
- Reasoning & problem-solving:
- Both chatbots are presented with a classic logical puzzle to solve, testing their reasoning abilities. Their responses are evaluated based on the clarity of their explanations and the accuracy of their solutions. While both ChatGPT and Gemini provide correct answers, ChatGPT’s response offers slightly more detailed reasoning, earning it a slight advantage in this category.
- Explain like I’m five (ELI5):
- This test assesses the chatbots’ capability to explain complex concepts in simple terms suitable for a young child. The evaluation considers the clarity, accuracy, and engagement of their explanations. Both ChatGPT and Gemini deliver reasonable and accurate responses, with Gemini structuring its explanation in a more organized manner, thereby enhancing its suitability for the target audience.
- Ethical reasoning & decision-making:
- The chatbots are presented with an ethical dilemma involving autonomous vehicles, testing their ability to analyze complex scenarios and make reasoned judgments. Their responses are evaluated based on the depth of ethical considerations and the clarity of their decision-making processes. While both chatbots offer thoughtful insights, Gemini demonstrates a more nuanced understanding of the ethical implications, earning it recognition in this category.
- Cross-lingual translation & cultural awareness:
- This test examines the chatbots’ proficiency in translating and conveying cultural nuances between languages. Their translations of a paragraph from English to French are evaluated for accuracy, cultural sensitivity, and explanatory quality. Gemini’s translation and contextual explanation showcase a stronger grasp of cultural nuances, contributing to its superiority in this test.
- Knowledge retrieval, application, & learning:
- Both chatbots are tasked with explaining the significance of the Rosetta Stone, a test of their ability to retrieve and apply knowledge effectively. Their responses are evaluated for accuracy, clarity, and depth of understanding. While both ChatGPT and Gemini provide accurate explanations, neither demonstrates significant learning or adaptation beyond the information provided.
- Conversational fluency, error handling, & recovery:
- In this test, the chatbots engage in a conversation about pizza, with a focus on handling misinformation and sarcasm, and recovering from misunderstandings. Their responses are evaluated for conversational fluency, error detection, and adaptive recovery strategies. While both chatbots demonstrate adept conversational skills, ChatGPT’s ability to detect and respond to sarcasm from the outset gives it a slight advantage in this category.
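The coding test above asked each chatbot for a Python personal expense tracker. Neither chatbot’s actual script is reproduced in this article, but a minimal sketch of the kind of program the prompt calls for might look like this (all class and method names are illustrative, not taken from either response):

```python
# Minimal sketch of a personal expense tracker, along the lines of the
# coding-test prompt. All names and features are illustrative.
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Expense:
    day: date
    category: str
    amount: float
    note: str = ""


class ExpenseTracker:
    def __init__(self):
        self.expenses = []

    def add(self, day, category, amount, note=""):
        # Reject non-positive amounts so the totals stay meaningful.
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.expenses.append(Expense(day, category, amount, note))

    def total(self):
        # Sum across all recorded expenses.
        return sum(e.amount for e in self.expenses)

    def by_category(self):
        # Aggregate spending per category for the report.
        totals = defaultdict(float)
        for e in self.expenses:
            totals[e.category] += e.amount
        return dict(totals)

    def report(self):
        # Simple text report: one line per category, then the grand total.
        lines = [f"{cat}: ${amt:.2f}"
                 for cat, amt in sorted(self.by_category().items())]
        lines.append(f"Total: ${self.total():.2f}")
        return "\n".join(lines)


tracker = ExpenseTracker()
tracker.add(date(2024, 3, 1), "food", 12.50, "lunch")
tracker.add(date(2024, 3, 2), "transport", 3.25)
tracker.add(date(2024, 3, 3), "food", 8.75)
print(tracker.report())
```

A stronger answer, as the review credits Gemini with, would extend this skeleton with extras such as persistence or monthly breakdowns; the core structure stays the same.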
| Test | ChatGPT | Gemini |
| --- | --- | --- |
| Coding | | X |
| Natural language | X | |
| Creative text | | X |
| Problem solving | X | |
| Explain like I’m 5 | | X |
| Ethical reasoning | | X |
| Translation | | X |
| Knowledge retrieval | X | X |
| Conversation | X | |
| Overall score | 4 | 6 |
By examining each point in detail, we gain a comprehensive understanding of the strengths and weaknesses of both ChatGPT and Google Gemini in various aspects of language understanding and generation.
Overall, Gemini emerges as the winner of this comparison, taking more categories than ChatGPT. Both chatbots deliver high-quality responses, but Gemini performs slightly better across the range of tests.