
Dataset Upload for AI Agents in GAME Cloud: Guidelines, Common Issues, and Best Practices

By: Joey Lau

GM builders!

Welcome to this guide on custom dataset uploads. We’ll walk you through the essentials, covering:

  • The motivation behind uploading datasets

  • A step-by-step process for uploading datasets via GAME Cloud, and things to take note of

  • Best practices to ensure your dataset works effectively

  • Real-world use cases to inspire your projects

The motivation behind uploading datasets

Uploading custom datasets is essential for tailoring your AI agent’s performance to specific needs. Let’s dive into the motivations behind this:

1️⃣ Customization for Unique Use Cases

Why?

Publicly available datasets or APIs might not fully capture the specific needs of your application (e.g., tracking niche crypto projects or analyzing specific Telegram channels).

When?

  • Your AI agent requires domain-specific knowledge (e.g., proprietary market reports, custom research, or industry datasets).

  • The available public datasets contain irrelevant or generalized data not suited for your niche.

Example:

A project that analyzes user sentiment on lesser-known altcoins might need to upload a custom domain-knowledge dataset.

2️⃣ Enhancing AI Understanding of Proprietary Content

Why?

If your agent interacts with users based on internal documentation or proprietary content, it needs access to that content to generate accurate responses.

When?

  • The agent is required to answer customer or team-specific queries (e.g., FAQs, internal documentation, or project-specific reports).

  • You want the agent to provide personalized recommendations or insights based on your business data.

Example:

A DeFi protocol may upload a dataset containing platform-specific FAQs, governance proposals, and tokenomics information to enhance user support through the agent.

How to Upload Datasets via GAME Cloud

Follow these steps to upload your dataset seamlessly:

Step 1: Navigate to the Dataset section under Agent Knowledge.

Step 2: Select Upload Datasets.

Step 3: Choose a file in one of the supported formats listed.

Step 4: Upload your file.

Step 5: Set the toggle for the uploaded dataset. When enabled (toggle switched on), the agent uses the uploaded dataset as a reference document when generating responses. When disabled (toggle switched off), the dataset remains uploaded but is temporarily ignored by the agent. This is useful if you want the agent to stop referring to the dataset for a while without deleting it.

Step 6: In the Tweet Enrichment setup section, ensure that the “Enable Tweet Enrichment” option is selected.

Step 7: Also check that {{retrieveknowledge}} is enabled.

🗒️ Things to Take Note of

While uploading datasets, here are some key considerations:

The agent may not reference the dataset correctly

  • Cause: The dataset may not be correctly recognized or retrieved by the agent.

  • Solutions:

    • Solution 1: Delete the dataset and re-upload it, ensuring correct format and structure compliance.

    • Solution 2: Enable the Retrieve Knowledge option under the Tweet Enrichment segment to allow the agent to access the uploaded dataset.

    • Solution 3: Ensure that the agent's goal is properly linked to the uploaded dataset, so that the agent can reference and use the dataset when generating responses. Consider adding a prompt along the lines of the sample below:

      • Designed to enhance tweet enrichment by leveraging uploaded datasets. When generating responses, always check if relevant information exists in the uploaded dataset.

Dataset Upload Constraints

Here are some important platform limits to be aware of:

1️⃣ Supported Formats

The GAME Sandbox currently supports the following file formats for dataset uploads:

  • PDF: Often used for large documents or reports.

  • TXT: Best for simple, structured text data or logs.

  • CSV: Ideal for tabular data with rows and columns. This format works well for numerical data, time series, and datasets requiring structured relationships.

  • HTML: Useful when the dataset involves web-based content, such as blog articles or structured pages. HTML files can retain formatting and metadata, making them beneficial for agents focused on web scraping or content parsing.

  • XLSX: Suitable for complex spreadsheets with multiple sheets, formulas, and structured data. This format is great for datasets that require various data types and categorization.

2️⃣ File Size Limits

  • The maximum file size for each upload is 10 MB.

  • Uploading files larger than this limit may cause failures or performance degradation. For larger datasets, consider breaking the file into multiple smaller chunks.
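
One way to break a large text dataset into smaller uploads is sketched below. This is only an illustration, not part of GAME Cloud: the function name `split_records_by_size` is hypothetical, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and a "record" is whatever unit you do not want to cut in half (a line, a Q&A pair, a paragraph).

```python
from typing import Iterator, List

# Assumption: the 10 MB limit is read as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024


def split_records_by_size(records: List[str], max_bytes: int = MAX_BYTES) -> Iterator[List[str]]:
    """Yield groups of records whose combined UTF-8 size is at most max_bytes."""
    chunk: List[str] = []
    size = 0
    for record in records:
        record_size = len(record.encode("utf-8"))
        if record_size > max_bytes:
            raise ValueError("a single record is larger than the chunk limit")
        # Start a new chunk when adding this record would exceed the limit.
        if chunk and size + record_size > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(record)
        size += record_size
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Tiny limit used only to make the behaviour visible.
    pairs = ["Q: one?\nA: first.\n", "Q: two?\nA: second.\n", "Q: three?\nA: third.\n"]
    for index, group in enumerate(split_records_by_size(pairs, max_bytes=40), start=1):
        print(f"chunk {index}: {len(group)} record(s)")
```

Splitting on whole records rather than raw bytes keeps each question together with its answer, so every chunk still makes sense on its own.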

3️⃣ File Amount Restrictions

  • It is recommended to limit the total number of uploaded files to maintain system efficiency. Ideally, upload no more than 5 files.

  • Uploading excessive numbers of files can result in increased processing time, system lag, or errors during dataset retrieval.
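
The constraints above can be checked locally before you upload. The snippet below is a minimal sketch, not a GAME Cloud API: `check_upload_batch` and the constant names are hypothetical, and the 10 MB limit is assumed to mean 10 × 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import List

# Formats listed in this guide; the size interpretation is an assumption.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".csv", ".html", ".xlsx"}
MAX_FILE_BYTES = 10 * 1024 * 1024
RECOMMENDED_MAX_FILES = 5


def check_upload_batch(paths: List[Path]) -> List[str]:
    """Return human-readable problems; an empty list means the batch looks fine."""
    problems: List[str] = []
    if len(paths) > RECOMMENDED_MAX_FILES:
        problems.append(
            f"{len(paths)} files exceeds the recommended maximum of {RECOMMENDED_MAX_FILES}"
        )
    for path in paths:
        if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
            problems.append(f"{path.name}: unsupported format {path.suffix or '(none)'}")
        elif not path.is_file():
            problems.append(f"{path.name}: file not found")
        elif path.stat().st_size > MAX_FILE_BYTES:
            problems.append(f"{path.name}: larger than 10 MB")
    return problems


if __name__ == "__main__":
    for problem in check_upload_batch([Path("faq.txt"), Path("notes.docx")]):
        print(problem)
```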

Dataset Upload: Best Practices and Tips

Follow these best practices to ensure smooth integration and efficient dataset use:

Structured Organization

Why

AI agents rely on clearly defined sections to parse and retrieve information effectively.

How

Use consistent headers (e.g., # Section: Common Questions) to categorize different data types such as FAQs and tokenomics.

Example

The "Common Questions" section helps the agent identify relevant answers based on user queries without scanning the entire document.

Relevance to Agent Goals

Why

Excessive irrelevant data can degrade the agent's performance by increasing processing time and reducing focus.

How

Include only the sections and data points directly tied to the agent's tasks.

Example

A DeFi support agent may not need token distribution details unless governance or staking queries are common.

Support for Semantic Understanding

Why

Agents powered by NLP benefit from additional context, such as explanatory notes or related information.

How

Add brief definitions or context alongside technical terms where necessary.

Example

Including a short definition of a technical term helps the agent explain the concept to users unfamiliar with it.

Avoidance of Redundancies

Why

Duplicate data can confuse the agent's search and retrieval mechanisms.

How

Perform a dataset audit to remove redundant entries or sections that repeat across multiple documents.

Example

Use unique document titles and well-organized sections to differentiate content clearly.
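
A dataset audit can start with something as simple as the sketch below. It is an assumption-laden illustration rather than a GAME feature: `normalize` and `drop_duplicates` are hypothetical helpers, and two entries count as duplicates only if they match after lowercasing and collapsing whitespace.

```python
import re
from typing import Iterable, List


def normalize(entry: str) -> str:
    """Lowercase and collapse whitespace so near-identical entries compare equal."""
    return re.sub(r"\s+", " ", entry).strip().lower()


def drop_duplicates(entries: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each entry, preserving order; skip empty entries."""
    seen = set()
    unique: List[str] = []
    for entry in entries:
        key = normalize(entry)
        if key and key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


if __name__ == "__main__":
    entries = [
        "Q: How can I stake my tokens?",
        "q:  how can I stake my tokens? ",
        "Q: What are governance proposals?",
    ]
    for entry in drop_duplicates(entries):
        print(entry)
```

This only catches exact repeats after normalization. Entries that say the same thing in different words still need a manual review.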

Compliance and Privacy

Why

Sensitive data, such as wallet addresses or personal information, should be protected to prevent privacy breaches.

How

Anonymize or redact sensitive information where necessary.

Example

Replace identifiable wallet addresses with placeholders in public-facing datasets.
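
As a rough sketch of the placeholder approach, the snippet below redacts wallet addresses before a dataset is uploaded. It assumes EVM-style addresses ("0x" followed by exactly 40 hexadecimal characters); other chains use different formats and would need their own patterns. The `redact_wallets` name and the placeholder text are hypothetical.

```python
import re

# Assumption: EVM-style addresses only ("0x" + exactly 40 hex characters).
EVM_ADDRESS = re.compile(r"\b0x[0-9a-fA-F]{40}\b")


def redact_wallets(text: str, placeholder: str = "[WALLET_ADDRESS]") -> str:
    """Replace every EVM-style address in text with a placeholder."""
    return EVM_ADDRESS.sub(placeholder, text)


if __name__ == "__main__":
    sample = "Refund sent to 0x" + "ab" * 20 + " yesterday."
    print(redact_wallets(sample))
```

Longer hex strings, such as 64-character transaction hashes, are left untouched because the pattern requires a word boundary right after the 40th character.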


Real-World Example Use Cases

Here are some practical use cases to get your brain juices flowing and give you some ideas on how to leverage this feature!

Example 1: Proprietary FAQs for DeFi Protocol Support

Format: TXT (Q&A structure)

# Dataset: DeFi Protocol FAQs
# Purpose: Improve support responses for a decentralized finance (DeFi) platform.

Q: What is the purpose of the $TOKEN in your protocol?
A: The $TOKEN is used for governance, staking rewards, and transaction fee discounts.

Q: How can I stake my tokens?
A: You can stake your tokens via the protocol dashboard by navigating to the "Staking" section and following the on-screen instructions.

Q: Are my funds insured in the event of a smart contract failure?
A: Currently, our protocol offers no direct insurance; however, third-party providers may offer coverage options.

Q: What are governance proposals, and how can I vote?
A: Governance proposals are community-driven initiatives that require token holders to vote. Visit the governance portal to cast your vote.

Key Considerations:

  • Organize the dataset with Q&A pairs for easy retrieval.

  • Add domain-specific terminology to enhance the AI agent's understanding of industry jargon.

  • Handle edge cases, such as variations in user queries.

  • Clean the dataset to remove duplicates and irrelevant entries, which may confuse the agent.
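
Before uploading a Q&A file like the one above, it can help to confirm that every question has an answer. The parser below is a minimal sketch under a few assumptions: each question and answer sits on a single line starting with "Q:" or "A:", and lines starting with "#" are header comments. `parse_qa_pairs` is a hypothetical helper, not part of GAME.

```python
from typing import List, Optional, Tuple


def parse_qa_pairs(text: str) -> List[Tuple[str, str]]:
    """Parse 'Q:' / 'A:' lines into (question, answer) pairs.

    Blank lines and lines starting with '#' are skipped.
    Raises ValueError if a question or answer is unpaired.
    """
    pairs: List[Tuple[str, str]] = []
    question: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("Q:"):
            if question is not None:
                raise ValueError(f"question without an answer: {question!r}")
            question = line[2:].strip()
        elif line.startswith("A:"):
            if question is None:
                raise ValueError(f"answer without a question: {line!r}")
            pairs.append((question, line[2:].strip()))
            question = None
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    if question is not None:
        raise ValueError(f"question without an answer: {question!r}")
    return pairs


if __name__ == "__main__":
    sample = "# Dataset: DeFi Protocol FAQs\n\nQ: How can I stake my tokens?\nA: Use the Staking section.\n"
    for q, a in parse_qa_pairs(sample):
        print(f"{q} -> {a}")
```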

Example 2: Crypto Sentiment Analysis (Twitter/Telegram Discussions)

Format: TXT (dataset structure)

# Dataset: Crypto Sentiment on Altcoins
# Source: Twitter & Telegram discussions
# Format: Timestamp | Platform | Username | Text | Sentiment (Positive/Negative/Neutral)

2025-01-10 12:34:56 | Twitter | @CryptoTraderX | "Really bullish on $XYZ! Incredible project." | Positive
2025-01-11 08:23:45 | Telegram | User1234 | "This altcoin is just another scam, not touching it." | Negative
2025-01-12 15:10:00 | Twitter | @AltcoinExpert | "Holding $ABC for long-term gains. Fundamentals look strong." | Positive
2025-01-13 10:12:00 | Telegram | CryptoTalk456 | "Neutral on $LMN right now. Waiting for more updates." | Neutral

Key Considerations:

  • Ensure text is structured consistently with clear delimiters (e.g., |).

  • Add metadata, such as timestamps and platforms, to support time-based analysis.
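
To check that every row follows the delimiter format shown above, you could run a quick parse before uploading. The sketch below is illustrative only: `SentimentRecord`, `parse_record`, and `parse_dataset` are hypothetical names, and the timestamp layout is assumed to be `YYYY-MM-DD HH:MM:SS` as in the sample rows.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

ALLOWED_SENTIMENTS = {"Positive", "Negative", "Neutral"}


@dataclass
class SentimentRecord:
    timestamp: datetime
    platform: str
    username: str
    text: str
    sentiment: str


def parse_record(line: str) -> SentimentRecord:
    """Parse one 'Timestamp | Platform | Username | Text | Sentiment' row."""
    # Take three fields from the left and the label from the right, so a "|"
    # inside the quoted text does not break parsing.
    head = line.split("|", 3)
    tail = head[-1].rsplit("|", 1) if len(head) == 4 else []
    if len(tail) != 2:
        raise ValueError(f"expected 5 pipe-separated fields: {line!r}")
    sentiment = tail[1].strip()
    if sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unknown sentiment label: {sentiment!r}")
    return SentimentRecord(
        timestamp=datetime.strptime(head[0].strip(), "%Y-%m-%d %H:%M:%S"),
        platform=head[1].strip(),
        username=head[2].strip(),
        text=tail[0].strip().strip('"'),
        sentiment=sentiment,
    )


def parse_dataset(text: str) -> List[SentimentRecord]:
    """Parse all rows, skipping blank lines and '#' header lines."""
    return [
        parse_record(line)
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


if __name__ == "__main__":
    row = '2025-01-13 10:12:00 | Telegram | CryptoTalk456 | "Neutral on $LMN right now." | Neutral'
    print(parse_record(row))
```

Rows that fail to parse are usually the ones the agent will struggle with too, so fixing them before upload is cheap insurance.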

Example 3: Web 3 Project Support Agent

Format: PDF (Referencing FAQs, Tutorials, User Instructions, Research Paper)

# Section: Common Questions
Q: How do I connect my wallet to the platform?  
A: Click on the "Connect Wallet" button on the top-right corner of the page and follow the instructions.

Q: What is slippage tolerance?  
A: Slippage refers to the price difference between the time an order is placed and when it is executed.

# Section: Tokenomics Overview
- Max Supply: 1,000,000,000 XYZ tokens
- Distribution:
  - 60% Public
  - 20% Team and Development
  - 10% Ecosystem Fund

Key Considerations:

  • Misinterpretation Prevention: Users may phrase the same query differently ("Why is my transaction failing?" vs. "My swap didn’t go through"), so train the AI to match intent, not just keywords.

  • Security-related questions (e.g., “Is my wallet safe?”) should be handled with caution, linking to official documentation and disclaimers rather than speculative answers.


And that’s a wrap! 🚀

But hey, remember quality > quantity every time!

Your AI agent doesn’t need a data dump—all it needs is clean, structured, and relevant data! WAGMI! 🤖


Stay Connected and Join the Virtuals Community! 🤖 🎈


Last updated 2 months ago

X: @GAME_Virtuals

https://x.com/GAME_Virtuals

For updates and to join our live streaming jam sessions every Wednesday. Stay in the loop and engage with us in real time!

Discord: @Virtuals Protocol

http://discord.gg/virtualsio

Join Discord for tech support and troubleshooting, and don’t miss our GAME Jam session every Wednesday!

Telegram: @Virtuals Protocol

https://t.me/virtuals

Join our Telegram group for non-tech support! Whether you need advice, a chat, or just a friendly space, we’re here to help!