GAIA Benchmark Agent

This Hugging Face Space hosts a GAIA (General AI Assistants) benchmark agent designed to solve certification challenges across various domains of AI and machine learning.

Features

  • Processes questions from the GAIA benchmark
  • Uses LangChain and OpenAI's language models
  • Analyzes questions and identifies their types
  • Retrieves relevant context when needed
  • Generates accurate, well-reasoned answers
  • Integrates with external information sources:
    • SerpAPI for real-time web search capabilities
    • YouTube for video content search and transcript analysis
    • Tavily for AI-optimized search results
    • Audio processing for speech-to-text conversion and analysis

Usage

  1. Log in to your Hugging Face account using the login button
  2. Click 'Run Evaluation & Submit All Answers' to:
    • Fetch questions from the GAIA benchmark
    • Run the agent on all questions
    • Submit answers and see your score

Implementation Details

The agent uses a modular architecture with specialized handlers for different question types:

  • Factual knowledge questions
  • Technical implementation questions
  • Mathematical questions
  • Context-based analysis questions
  • Ethical/societal impact questions
  • Media content questions (videos, podcasts, audio recordings)
  • Current events questions
  • Categorization questions with enhanced botanical classification
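
The handler-per-question-type design can be sketched as a dispatch table. This is a minimal illustration, not the agent's actual code: the handler functions, the HANDLERS mapping, and the keyword-based classify_question are all hypothetical names, and the real agent may well classify questions with a language model instead.

```python
from typing import Callable, Dict

# Hypothetical handlers; the agent's real handlers are not shown in this README.
def handle_factual(question: str) -> str:
    return f"[factual] {question}"

def handle_math(question: str) -> str:
    return f"[math] {question}"

def handle_general(question: str) -> str:
    return f"[general] {question}"

# Maps a question type to the handler responsible for it.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "factual": handle_factual,
    "mathematical": handle_math,
}

def classify_question(question: str) -> str:
    """Toy keyword classifier (an assumption made for this sketch)."""
    lowered = question.lower()
    if any(token in lowered for token in ("calculate", "sum of", "how many")):
        return "mathematical"
    if lowered.startswith(("who", "when", "where")):
        return "factual"
    return "general"

def answer(question: str) -> str:
    """Route the question to its handler, falling back to a general one."""
    handler = HANDLERS.get(classify_question(question), handle_general)
    return handler(question)
```

Adding a new question type then means writing one handler and registering it in the table, without touching the routing logic.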

Botanical Classification

The agent has been enhanced with comprehensive botanical classification capabilities, allowing it to:

  • Accurately distinguish between botanical fruits and vegetables
  • Provide detailed explanations of botanical classifications
  • Correctly identify commonly misclassified items (tomatoes, bell peppers, cucumbers, etc.)
  • Explain the difference between botanical and culinary classifications
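
As a rough illustration of the botanical-versus-culinary distinction, the sketch below uses a small lookup table. The table and function names are hypothetical and the entries are only a sample; the agent's own classification logic is not documented here.

```python
from typing import Iterable, List

# Botanical fruits develop from the flower's ovary and carry seeds,
# even when they are treated as vegetables in the kitchen.
BOTANICAL_FRUITS = {"tomato", "bell pepper", "cucumber", "zucchini", "eggplant"}

# Other edible plant parts: roots, leaves, stems, flowers, tubers.
BOTANICAL_VEGETABLES = {"carrot", "lettuce", "celery", "broccoli", "potato"}

def classify_produce(name: str) -> str:
    """Return 'fruit', 'vegetable', or 'unknown' for a produce name."""
    key = name.strip().lower()
    if key in BOTANICAL_FRUITS:
        return "fruit"
    if key in BOTANICAL_VEGETABLES:
        return "vegetable"
    return "unknown"

def botanical_vegetables(items: Iterable[str]) -> List[str]:
    """Keep only the items that are vegetables in the botanical sense."""
    return sorted(item for item in items if classify_produce(item) == "vegetable")
```

For example, a grocery list containing tomatoes and cucumbers would have both excluded from the vegetable list, since each is a seed-bearing fruit.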

External Information Sources

The agent can access external information to provide more accurate and up-to-date answers:

  • SerpAPI Integration: Enables real-time web search capabilities for current events and factual information
  • YouTube Integration:
    • Search for relevant videos on specific topics
    • Extract and analyze video transcripts for information
  • Tavily Search: AI-optimized search engine that provides relevant results for complex queries
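
Transcript extraction needs a video ID before any transcript can be requested. The sketch below shows one way to pull the ID out of common YouTube URL shapes using only the standard library. The function name extract_video_id is hypothetical, and the set of supported URL forms is an assumption rather than a description of what the agent handles.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")

def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if it is not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    if host in YOUTUBE_HOSTS:
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embedded players and Shorts: /embed/<id>, /shorts/<id>
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    return None
```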

Audio Processing Capabilities

The agent has been enhanced with audio processing capabilities, allowing it to:

  • Transcribe audio files using OpenAI's Whisper API, with Google Speech Recognition as a fallback
  • Extract ingredients from recipe audio recordings
  • Process and analyze spoken content from various audio formats
  • Format responses according to user requests for audio content
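
The Whisper-first, Google-second transcription order can be expressed as a small wrapper around two transcriber callables. This is a sketch under assumptions: the callables are placeholders supplied by the caller, no real Whisper or Google client is invoked, and treating an empty transcript as a failure is a design choice made for this example.

```python
from typing import Callable

# A transcriber takes a path to an audio file and returns the transcript text.
Transcriber = Callable[[str], str]

def transcribe_with_fallback(
    audio_path: str,
    primary: Transcriber,
    fallback: Transcriber,
) -> str:
    """Try the primary transcriber; use the fallback if it raises or returns nothing."""
    try:
        text = primary(audio_path)
        if text.strip():
            return text
    except Exception:
        # Any failure in the primary service (network, auth, format) triggers the fallback.
        pass
    return fallback(audio_path)
```

Passing the transcribers in as arguments keeps the fallback logic testable without network access or API keys.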

API Keys Configuration

To use the external information sources, you need to set the following API keys in your environment:

  • SERPAPI_API_KEY: For web search capabilities
  • YOUTUBE_API_KEY: For YouTube video search and transcript analysis
  • TAVILY_API_KEY: For AI-optimized search results
  • WHISPER_API_KEY: For audio transcription (defaults to OPENAI_API_KEY if not set)
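
A minimal sketch of how these keys might be read at startup, including the WHISPER_API_KEY default described above. The function names load_api_keys and missing_keys are hypothetical; only the environment variable names come from this README.

```python
import os
from typing import Dict, List, Mapping, Optional

def load_api_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Collect the API keys, reading from os.environ unless a mapping is supplied."""
    env = os.environ if env is None else env
    return {
        "SERPAPI_API_KEY": env.get("SERPAPI_API_KEY"),
        "YOUTUBE_API_KEY": env.get("YOUTUBE_API_KEY"),
        "TAVILY_API_KEY": env.get("TAVILY_API_KEY"),
        # WHISPER_API_KEY defaults to OPENAI_API_KEY when it is not set.
        "WHISPER_API_KEY": env.get("WHISPER_API_KEY") or env.get("OPENAI_API_KEY"),
    }

def missing_keys(keys: Mapping[str, Optional[str]]) -> List[str]:
    """List the names of keys that are unset or empty."""
    return sorted(name for name, value in keys.items() if not value)
```

On a Hugging Face Space, these variables would typically be supplied as Space secrets so that they never appear in the repository.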

Repository

The code for this agent is available at: https://huggingface.co/derkaal/GAIA-agent

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
