Teaching Data Literacy with Hugging Face's AI Sheets

Community Article Published June 30, 2025

A step-by-step guide that shows how to go from questions to actual data, one column at a time.

Data is the cornerstone of AI, but the real learning happens when you understand how that data is built, step by step, out in the open. Last weekend, I sat down with my kid to discuss charts, graphs, and how to derive meaningful conclusions from data, as part of a class project he is doing. I wanted to emphasize that data isn't just about numbers; it's about asking the right questions, organizing the answers clearly, and understanding what the patterns mean. However, teaching a new concept to a fifth grader isn't easy. So I leaned into something he was already learning in class, i.e., animal habitats and adaptations. I figured I'd use the same ideas to bring this lesson to life. But how?

Show me the Data?

Data! Data! Data! I can't make bricks without clay.

That's Sherlock Holmes in The Adventure of the Copper Beeches, snapping at Watson. He was stating the obvious, i.e., you can't do analysis without data. So the first step was to build a proper dataset. I opened a spreadsheet and we added the names of the animals that he knew or was interested in knowing about. That was all we had. Just names!

Image description
A spreadsheet with the names of the animals in the first column.

Brainstorming

We started by brainstorming what we actually wanted to know about each animal. What kind of questions would help us learn something new? What would make the dataset interesting enough to explore? Each question would eventually become a column in the dataset.

Image description
Questions that we needed answers to. These would eventually become the columns of the dataset we wanted to create.

Answering all those questions for every animal would take forever. It wasn't impossible, but it would mean a ton of web searches. Was there a better way? Perhaps there was, and that's where AI Sheets comes in.

AI Sheets to the Rescue

A few days ago, I saw the following tweet on my timeline:

image/png

This looked like something I could use right away, and that's exactly what I did. In this article, I've shared my own, unbiased experience with the tool, starting from a single column and gradually building a complete, rich dataset. It's a great playground for experimentation, letting you test ideas quickly while staying in control throughout.

What are AI Sheets?

image/png

Before moving further, let's take a moment to understand AI Sheets in a bit more detail. AI Sheets is like a spreadsheet on steroids. Developed by the Hugging Face team, it's a tool that brings the power of LLMs directly into a familiar table interface. In other words, instead of just storing data, you can ask it to create data. It connects to a wide range of open-source models from the Hugging Face Hub and can also pull live information from the web. You write a prompt, and it fills out the column for you. Since it uses real web results when needed, you can check the facts against their sources. You can explore it here: https://huggingface.co/spaces/aisheets/sheets

Now that you know what Sheets can do, let's get back to building our dataset.


The Starting Point

We imported our CSV file, which contained a single column of animal names, into the Sheets environment. This would act as the seed column for the dataset. Once imported, the names were populated in Column A of the sheet, as shown below.
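If you would rather prepare the seed file with a script than by hand, here is a minimal sketch using Python's standard `csv` module. The header `Animal Name` matches the placeholder used in the prompts in this article; the file name `animals.csv` and all the example names other than the elephant are assumptions made for illustration.

```python
import csv
import io


def build_seed_csv(animal_names, header="Animal Name"):
    """Return CSV text with one header row and one animal name per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([header])
    for name in animal_names:
        cleaned = name.strip()
        if cleaned:  # skip blank entries
            writer.writerow([cleaned])
    return buffer.getvalue()


# Hypothetical example names; replace them with your own list.
seed_text = build_seed_csv(["African Elephant", "Emperor Penguin", " Red Fox "])
print(seed_text)

# To save the text for upload (file name is an assumption):
# with open("animals.csv", "w", newline="") as f:
#     f.write(seed_text)
```

The resulting file has a single column, which is all the tool needs as a starting point.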

image/gif

Adding Basic Information

The next step was to fill in some basic information about the animals. We wanted to capture a few foundational facts about each one: its scientific name, its habitat, and what it eats. For each of these questions, we added a column and wrote a precise prompt to help the model fetch the right information from the web. Here's how it looked for the second column:

  • Column B: Scientific Name
  • Prompt: What is the scientific name of the {{Animal Name}}? Only mention the name.

image/gif

I did a quick Google search to check whether the information was actually correct. By clicking the 🌐 icon, I could easily access the sources and verify the results. This is the beauty of this tool: you can manually edit anything you think is incorrect, or let the tool take over. For instance, I got Loxodonta africana as the scientific name of the elephant. But when I checked the sources, I realized this refers specifically to the African Elephant. So I had to be more precise about which kind I meant. I edited the corresponding entry in Column A and renamed it to African Elephant.

image/png

We followed the same process for the next three columns as well:

  • Column C: Habitat
  • Prompt: What is the natural habitat of the {{Animal Name}}?

  • Column D: Diet
  • Prompt: What does the {{Animal Name}} typically eat?

  • Column E: Average Lifespan
  • Prompt: What is the average lifespan of {{Animal Name}}? Answer age in numbers only

Instead of manually googling facts for every animal, I could ask a clear, well-phrased question once and apply it down a column. The dataset had started to take shape. This is how it looked at the end of our second step.
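To make the "ask once, apply down a column" idea concrete, here is a small Python sketch of how a `{{Column Name}}` placeholder can be filled in for every row. This is only an illustration of the concept, not how AI Sheets is implemented internally; the function name `render_prompt` and the second example row are made up for this sketch.

```python
import re

# Matches {{Column Name}}, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(.+?)\s*\}\}")


def render_prompt(template, row):
    """Replace each {{Column Name}} with that column's value from the row."""
    def substitute(match):
        column = match.group(1)
        if column not in row:
            raise KeyError(f"Unknown column in prompt: {column}")
        return str(row[column])

    return PLACEHOLDER.sub(substitute, template)


template = "What is the scientific name of the {{Animal Name}}? Only mention the name."
rows = [{"Animal Name": "African Elephant"}, {"Animal Name": "Red Fox"}]  # hypothetical rows

prompts = [render_prompt(template, row) for row in rows]
for prompt in prompts:
    print(prompt)
```

Each row gets its own fully written question, which is why one carefully worded prompt is enough for the whole column.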

image/png

💡 Tips to Improve Feedback

  • Refine Prompts to Improve Output: The way you phrase a question matters. Even a small tweak in wording can completely change the kind of answer you get. If something feels off, just adjust the prompt and re-run it. You can keep refining until the output feels right.
  • Use Feedback to Guide the Model: Clicking the 👍 helps the app learn from good examples. Over time, this improves the quality of future completions for that column.

Adding more detail

For each animal, we wanted a bit more than just physical traits. Understanding how animals adapt to their surroundings helps link their behavior to environmental pressures, so we included this in our next column.

  • Column F: Adaptation
  • Prompt: What are some adaptations that help the {{Animal Name}} survive in its environment?

💡 These insights would also help set up the next column, where we will summarize and synthesize what we'd learned.

Summarizing Key Facts

After identifying unique adaptations, we needed a way to make the information more digestible. The goal was to pull out the most important points from a longer paragraph and capture them clearly. At the same time, we wanted to keep the original column in case we needed to refer back to it. So we created a new column.

  • Column G: Adaptation Summary
  • Prompt: Based on its {{Adaptation}}, summarize it in 5 keywords. The keywords shouldn't include {{Animal Name}}, the word: Adaptation, or anything which is not an adaptation
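The first two rules in this prompt (exactly five keywords, no animal name, no use of the word "adaptation") are mechanical enough to spot-check with a few lines of Python. The sketch below assumes the model returns a comma-separated list, which may not always be the case; the function name `check_keywords` and the sample summaries are invented for illustration. The third rule, that every keyword really is an adaptation, still needs a human eye.

```python
def check_keywords(summary, animal_name, expected_count=5):
    """Return a list of problems found in a comma-separated keyword summary."""
    keywords = [k.strip() for k in summary.split(",") if k.strip()]
    problems = []
    if len(keywords) != expected_count:
        problems.append(f"expected {expected_count} keywords, got {len(keywords)}")
    banned = [animal_name.lower(), "adaptation"]
    for keyword in keywords:
        for word in banned:
            if word in keyword.lower():
                problems.append(f"'{keyword}' contains banned term '{word}'")
    return problems


# Made-up model outputs, used only to exercise the checks.
good = check_keywords("large ears, trunk, tusks, thick skin, herd living", "African Elephant")
bad = check_keywords("African Elephant ears, adaptation", "African Elephant")
print(good)
print(bad)
```

An empty list means the summary passed the mechanical checks; anything else is a hint to tweak the prompt or edit the cell by hand.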

image/png

This step helped simplify complex ideas, making them easier to visualize later on. As shown above, we compared the answers from two different models side by side to see which one worked better for our use case.

💡 Different models handle summarization differently, hence the results depend a lot on model choice. There are a lot of open models available, and Sheets makes swapping models easy and visible, which helped us compare outputs in a real, hands-on way.

Creating a Derived Column

Once we had enough information across different columns, we wanted to see if we could generate something new by combining what was already there. So we added a derived column that pulled from earlier fields:

  • Column K: Food Chain Role
  • Prompt: Based on the {{Habitat}} and {{Diet}} of the {{Animal Name}}, is it most likely a herbivore, carnivore, omnivore, or scavenger in its food chain?

💡 We didn't need a web search for this one. The data already in the sheet was enough. It's a good example of how existing fields can work together to create new insights.
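A derived column only works if every column its prompt mentions already exists in the sheet. Here is a hedged Python sketch that lists the columns a prompt refers to and flags any that are missing. The helper names `referenced_columns` and `missing_columns` are hypothetical, and the column list mirrors the columns described earlier in this article.

```python
import re


def referenced_columns(prompt):
    """Return the column names used as {{placeholders}}, in order, without repeats."""
    names = re.findall(r"\{\{\s*(.+?)\s*\}\}", prompt)
    return list(dict.fromkeys(names))


def missing_columns(prompt, available):
    """List referenced columns that the sheet does not have yet."""
    return [name for name in referenced_columns(prompt) if name not in available]


prompt = (
    "Based on the {{Habitat}} and {{Diet}} of the {{Animal Name}}, is it most likely "
    "a herbivore, carnivore, omnivore, or scavenger in its food chain?"
)
sheet_columns = ["Animal Name", "Scientific Name", "Habitat", "Diet", "Average Lifespan"]

print(referenced_columns(prompt))              # columns the prompt depends on
print(missing_columns(prompt, sheet_columns))  # empty when everything is available
```

Since all three referenced columns are already filled in, the prompt can be answered from the sheet itself.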

image/png

Adding Visuals

Sheets also supports text-to-image generation, which means we could go beyond text and create visuals directly in the sheet. Kids learn better when they can see what they're reading. This time, we didn't have to give a prompt since image generation is available as a method in the dropdown. Just like with text, there are plenty of open-source models to choose from. We chose FLUX.1 from Black Forest Labs since the results were great.

image/png

The Final Result

After completing these steps, our simple list turned into a rich, multi-faceted dataset, which you can then download in CSV or Parquet format. We went from a single column to something meaningful, and then used it for data visualisation. Here's how the final version looks:

image/gif
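Once the dataset is downloaded as CSV, even a few lines of Python are enough to start on the charts. The sketch below counts how many animals fall into each food chain role and prints a simple text bar chart. The rows in `sample_export` are a hypothetical stand-in for the real export, and the sketch assumes each Food Chain Role cell holds a single word, which may require tightening the prompt.

```python
import csv
import io
from collections import Counter


def count_values(csv_text, column):
    """Count how often each value appears in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column].strip().lower() for row in reader if row.get(column))


# Hypothetical stand-in for the exported file; real exports have more columns.
sample_export = """Animal Name,Food Chain Role
African Elephant,Herbivore
Lion,Carnivore
Giraffe,Herbivore
"""

role_counts = count_values(sample_export, "Food Chain Role")
for role, count in role_counts.most_common():
    print(f"{role:<10} {'#' * count} ({count})")
```

To use the real file, read its contents with `open(...).read()` and pass the text to `count_values` in place of `sample_export`.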


Conclusion

In this article, I shared my experience of using Sheets. I started with just one column and gradually built it up. I could fix cells manually or feed examples to improve the suggestions using AI. Since each column had its own prompt and logic, I didn't have to keep starting over like you often do in a chatbot. You stay in control, the sources are visible, and it actually scales. Sheets is a great tool, and I'm looking forward to the new additions the team makes to this product.

Thanks to Amélie Viallet for reviewing the article.

Community

Amazing work. This article inspires me to try AI Sheets, not just for my kids but for my own work too.

Article author

Thanks @Sjangz. Definitely try them. AI Sheets are great for research.
