Dataset Upload for AI Agents in GAME Cloud: Guidelines, Common Issues, and Best Practices

By: Joey Lau

GM builders!

Welcome to this guide on custom dataset uploads. We’ll walk you through the essentials, covering:

  • The motivation behind uploading datasets

  • A step-by-step process for uploading datasets via GAME Cloud, and things to take note of

  • Best practices to ensure your dataset works effectively

  • Real-world use cases to inspire your projects

The motivation behind uploading datasets

Uploading custom datasets is essential for tailoring your AI agent’s performance to specific needs. Let’s dive into the motivations behind this:

1️⃣ Customization for Unique Use Cases

Why?

Publicly available datasets or APIs might not fully capture the specific needs of your application (e.g., tracking niche crypto projects or analyzing specific Telegram channels).

When?

  • Your AI agent requires domain-specific knowledge (e.g., proprietary market reports, custom research, or industry datasets).

  • The available public datasets contain irrelevant or generalized data not suited for your niche.

Example:

A project that analyzes user sentiment on lesser-known altcoins might need to upload a custom domain-knowledge dataset.

2️⃣ Enhancing AI Understanding of Proprietary Content

Why?

If your agent interacts with users based on internal documentation or proprietary content, it needs access to that content to generate accurate responses.

When?

  • The agent is required to answer customer or team-specific queries (e.g., FAQs, internal documentation, or project-specific reports).

  • You want the agent to provide personalized recommendations or insights based on your business data.

Example:

A DeFi protocol may upload a dataset containing platform-specific FAQs, governance proposals, and tokenomics information to enhance user support through the agent.

How to Upload Datasets via GAME Cloud

Follow these steps to upload your dataset seamlessly:

Step 1:

Navigate to the Dataset section under Agent Knowledge.

Step 2:

Select Upload Datasets.

Step 3:

Choose a file in one of the supported formats listed.

Step 4:

Upload your file.

Step 5:

Use the dataset's toggle to control whether the agent refers to it. When enabled (toggle switched on), the agent uses the uploaded dataset as a reference document when generating responses. When disabled (toggle switched off), the dataset remains uploaded but is ignored by the agent. This is useful if you want the agent to stop referring to a dataset temporarily without deleting it.

Step 6:

In the Tweet Enrichment setup section, ensure that the “Enable Tweet Enrichment” option is selected.

Step 7:

Also check that {{retrieveknowledge}} is enabled.

🗒️ Things to Take Note of

While uploading datasets, here are some key considerations:

The agent may not always reference the dataset correctly.

Dataset Upload: Best Practices and Tips

Follow these best practices to ensure smooth integration and efficient dataset use:

Structured Organization

Why

AI agents rely on clearly defined sections to parse and retrieve information effectively.

How

Use consistent headers (e.g., # Section: Common Questions) to categorize different data types such as FAQs and tokenomics.

Example

The "Common Questions" section helps the agent identify relevant answers based on user queries without scanning the entire document.
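
Before uploading, it can help to check locally that your headers are consistent. The sketch below is only an illustration: it assumes the "# Section:" header convention shown above, and split_sections is a hypothetical helper, not part of GAME Cloud or a description of how the agent parses files.

```python
import re

# Assumed header convention from this guide: "# Section: Common Questions".
SECTION_HEADER = re.compile(r"^#\s*Section:\s*(.+?)\s*$")


def split_sections(text: str) -> dict[str, str]:
    """Group dataset lines under the most recent "# Section:" header."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = SECTION_HEADER.match(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    # Lines that appear before the first header are ignored.
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


sample = """# Section: Common Questions
Q: How do I connect my wallet to the platform?
A: Click on the "Connect Wallet" button.

# Section: Tokenomics Overview
- Max Supply: 1,000,000,000 XYZ tokens
"""

sections = split_sections(sample)
print(list(sections))
```

If a section you expect is missing from the printed list, its header probably deviates from the pattern (for example, a typo in "Section").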

Relevance to Agent Goals

Why

Excessive irrelevant data can degrade the agent's performance by increasing processing time and reducing focus.

How

Include only the sections and data points directly tied to the agent's tasks.

Example

A DeFi support agent may not need token distribution details unless governance or staking queries are common.

Support for Semantic Understanding

Why

Agents powered by NLP benefit from additional context, such as explanatory notes or related information.

How

Add brief definitions or context alongside technical terms where necessary.

Example

Including a short definition of a technical term helps the agent explain the concept to users who are unfamiliar with it.

Avoidance of Redundancies

Why

Duplicate data can confuse the agent's search and retrieval mechanisms.

How

Perform a dataset audit to remove redundant entries or sections that repeat across multiple documents.

Example

Use unique document titles and well-organized sections to differentiate content clearly.
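
A dataset audit can start with a simple script. The following is a minimal sketch, assuming your entries are already split into a list of strings; find_duplicates and normalize are hypothetical helper names, and the matching is deliberately basic (case and whitespace only), so rephrased duplicates will not be caught.

```python
import re


def normalize(entry: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return re.sub(r"\s+", " ", entry).strip().lower()


def find_duplicates(entries: list[str]) -> list[tuple[int, int]]:
    """Return (first_index, duplicate_index) pairs for entries that repeat."""
    seen: dict[str, int] = {}
    duplicates = []
    for index, entry in enumerate(entries):
        key = normalize(entry)
        if key in seen:
            duplicates.append((seen[key], index))
        else:
            seen[key] = index
    return duplicates


entries = [
    "Q: How can I stake my tokens?",
    "Q: What are governance proposals, and how can I vote?",
    "q:  how can I stake my tokens? ",
]
print(find_duplicates(entries))
```

Each pair points at the first occurrence and the repeat, so you can decide which copy to keep.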

Compliance and Privacy

Why

Sensitive data, such as wallet addresses or personal information, should be protected to prevent privacy breaches.

How

Anonymize or redact sensitive information where necessary.

Example

Replace identifiable wallet addresses with placeholders in public-facing datasets.
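
One way to apply the placeholder approach is a regular expression pass before upload. This sketch assumes EVM-style addresses ("0x" followed by 40 hexadecimal characters); other chains use different formats and would need their own patterns. The placeholder text and the redact_addresses name are illustrative choices.

```python
import re

# Assumption: EVM-style addresses, i.e. "0x" followed by exactly 40 hex characters.
EVM_ADDRESS = re.compile(r"\b0x[0-9a-fA-F]{40}\b")


def redact_addresses(text: str, placeholder: str = "[WALLET_ADDRESS]") -> str:
    """Replace every EVM-style address in the text with a placeholder."""
    return EVM_ADDRESS.sub(placeholder, text)


# Build a dummy 40-character address rather than using a real one.
sample = "Refund sent to 0x" + "ab12" * 10 + " on 2025-01-10."
redacted = redact_addresses(sample)
print(redacted)
```

A pattern like this only catches addresses; names, emails, and other personal information still need a manual review.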


Real-World Example Use Cases

Here are some practical use cases to get your brain juices flowing and give you some ideas on how to leverage this feature!

Example 1: Proprietary FAQs for DeFi Protocol Support

Format: TXT (Q&A structure)

# Dataset: DeFi Protocol FAQs
# Purpose: Improve support responses for a decentralized finance (DeFi) platform.

Q: What is the purpose of the $TOKEN in your protocol?
A: The $TOKEN is used for governance, staking rewards, and transaction fee discounts.

Q: How can I stake my tokens?
A: You can stake your tokens via the protocol dashboard by navigating to the "Staking" section and following the on-screen instructions.

Q: Are my funds insured in the event of a smart contract failure?
A: Currently, our protocol offers no direct insurance; however, third-party providers may offer coverage options.

Q: What are governance proposals, and how can I vote?
A: Governance proposals are community-driven initiatives that require token holders to vote. Visit the governance portal to cast your vote.

Key Considerations:

  • Organize the dataset with Q&A pairs for easy retrieval.

  • Add domain-specific terminology to enhance the AI agent's understanding of industry jargon.

  • Handle edge cases, such as variations in how users phrase the same query.

  • Clean the dataset to remove duplicates and irrelevant entries, which may confuse the agent.
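
To confirm that a Q&A file is well-formed before uploading it, you could parse it locally. The sketch below assumes the "Q:" / "A:" line prefixes used in the example above, with each answer on a single line; parse_qa_pairs is a hypothetical helper, not a GAME Cloud function.

```python
def parse_qa_pairs(text: str) -> list[tuple[str, str]]:
    """Pair each "Q:" line with the first "A:" line that follows it."""
    pairs = []
    question = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question is not None:
            pairs.append((question, line[2:].strip()))
            question = None
        # Comment lines ("# ...") and blank lines are skipped.
    return pairs


sample = """# Dataset: DeFi Protocol FAQs

Q: How can I stake my tokens?
A: You can stake your tokens via the protocol dashboard.

Q: What are governance proposals, and how can I vote?
A: Visit the governance portal to cast your vote.
"""

pairs = parse_qa_pairs(sample)
print(len(pairs), "pairs found")
```

If the number of pairs is lower than the number of questions you wrote, at least one question is missing its answer line.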

Example 2: Crypto Sentiment Analysis (Twitter/Telegram Discussions)

Format: TXT (dataset structure)

# Dataset: Crypto Sentiment on Altcoins
# Source: Twitter & Telegram discussions
# Format: Timestamp | Platform | Username | Text | Sentiment (Positive/Negative/Neutral)

2025-01-10 12:34:56 | Twitter | @CryptoTraderX | "Really bullish on $XYZ! Incredible project." | Positive
2025-01-11 08:23:45 | Telegram | User1234 | "This altcoin is just another scam, not touching it." | Negative
2025-01-12 15:10:00 | Twitter | @AltcoinExpert | "Holding $ABC for long-term gains. Fundamentals look strong." | Positive
2025-01-13 10:12:00 | Telegram | CryptoTalk456 | "Neutral on $LMN right now. Waiting for more updates." | Neutral

Key Considerations:

  • Ensure text is structured consistently with clear delimiters (e.g., |).

  • Add metadata, such as timestamps and platforms, to support time-based analysis.
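
Consistency is easy to check with a small validator. This is a sketch under the assumptions of the format line above: five fields separated by "|", a "YYYY-MM-DD HH:MM:SS" timestamp, and one of three sentiment labels. validate_row is a hypothetical helper, and note that a "|" inside the message text would be counted as an extra delimiter.

```python
from datetime import datetime

ALLOWED_SENTIMENTS = {"Positive", "Negative", "Neutral"}


def validate_row(line: str) -> list[str]:
    """Return a list of problems found in one pipe-delimited row."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 5:
        return [f"expected 5 fields, found {len(fields)}"]
    timestamp, platform, username, text, sentiment = fields
    problems = []
    try:
        datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        problems.append(f"bad timestamp: {timestamp!r}")
    if sentiment not in ALLOWED_SENTIMENTS:
        problems.append(f"unknown sentiment: {sentiment!r}")
    return problems


good = '2025-01-10 12:34:56 | Twitter | @CryptoTraderX | "Really bullish on $XYZ!" | Positive'
bad = '2025-01-10 | Twitter | @CryptoTraderX | "Really bullish on $XYZ!" | Bullish'

print(validate_row(good))
print(validate_row(bad))
```

Running every data row through a check like this before upload catches malformed timestamps and stray labels early.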

Example 3: Web3 Project Support Agent

Format: PDF (Referencing FAQs, Tutorials, User Instructions, Research Paper)

# Section: Common Questions
Q: How do I connect my wallet to the platform?  
A: Click on the "Connect Wallet" button on the top-right corner of the page and follow the instructions.

Q: What is slippage tolerance?  
A: Slippage is the price difference between the time an order is placed and when it is executed. Slippage tolerance is the maximum slippage you are willing to accept before the trade is rejected.

# Section: Tokenomics Overview
- Max Supply: 1,000,000,000 XYZ tokens
- Distribution:
  - 60% Public
  - 20% Team and Development
  - 10% Ecosystem Fund

Key Considerations:

  • Misinterpretation Prevention: Users may phrase the same query differently ("Why is my transaction failing?" vs. "My swap didn’t go through"). Include common phrasings in the dataset so the agent can match intent, not just keywords.

  • Security-related questions (e.g., “Is my wallet safe?”) should be handled with caution, linking to official documentation and disclaimers rather than speculative answers.


And that’s a wrap! 🚀

But hey, remember quality > quantity every time!

Your AI agent doesn’t need a data dump—all it needs is clean, structured, and relevant data! WAGMI! 🤖


Stay Connected and Join the Virtuals Community! 🤖 🎈

X: @GAME_Virtuals

Follow us for updates and join our live-streamed jam sessions every Wednesday. Stay in the loop and engage with us in real time!

Discord: @Virtuals Protocol

Join Discord for tech support and troubleshooting, and don’t miss our GAME Jam session every Wednesday!

Telegram: @Virtuals Protocol

Join our Telegram group for non-tech support! Whether you need advice, a chat, or just a friendly space, we’re here to help!
