
File Upload Ingestion

Overview

File Upload ingestion allows organizations to upload and semantically process documents, media files, and other supported content types for use within IB-X Conversational Agents.

The ingestion process extracts content from uploaded files, generates semantic embeddings, and stores the processed knowledge for Agent-specific retrieval.

Knowledge ingested through this workflow is available only to the Agent from which the ingestion was configured.

File Upload ingestion is performed through the Knowledge Base Configuration wizard available inside the Agent Designer.

The workflow allows users to:

  • Upload one or multiple files
  • Select content extraction types
  • Configure specialized semantic collections
  • Select files for ingestion
  • Process and index uploaded content

Configuring File Upload Ingestion

To configure file upload ingestion:

  1. Open the required Agent in the Designer
  2. Click the Ingestions button available in the designer canvas
  3. Click Add Ingestion

The Knowledge Base Configuration wizard is displayed.


Step 1 — Source

The Source step allows users to choose the type of knowledge source to ingest.

Select:

File Upload

This option enables ingestion of uploaded files and media content.

Supported source types include:

| Source Type | Description |
|---|---|
| Website URL | Crawl and ingest website content |
| File Upload | Upload and process supported files |

After selecting File Upload, continue to the next step.


Step 2 — Details

The Details step captures the file upload ingestion configuration.


Data Source Name

Specify a friendly display name for the ingestion source.

This name is displayed in the Ingestions dashboard and helps identify the configured file ingestion source.

Example:

  • IB Documents

Upload Files

The Upload Files section allows users to upload one or multiple files for semantic ingestion.

Users can:

  • Click to browse and select files
  • Drag and drop files directly into the upload area

Supported File Formats

The platform currently supports ingestion of the following formats:

| Format Type | Supported Formats |
|---|---|
| Text Documents | TXT, DOC, DOCX |
| Presentations | PPT, PPTX |
| Spreadsheets | XLS, XLSX |
| PDF Documents | PDF |

Additional formats may be supported depending on the ingestion engine configuration.
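
As a rough illustration, files could be pre-checked against the formats listed above before they are uploaded. The sketch below is not part of the platform; `SUPPORTED_FORMATS`, `classify_file`, and `split_supported` are hypothetical names. It assumes that format support can be judged from the file extension alone, which may not match how the ingestion engine actually validates files.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the supported formats listed above.
# The mapping itself is an illustrative assumption, not a platform API.
SUPPORTED_FORMATS = {
    "Text Documents": {".txt", ".doc", ".docx"},
    "Presentations": {".ppt", ".pptx"},
    "Spreadsheets": {".xls", ".xlsx"},
    "PDF Documents": {".pdf"},
}


def classify_file(filename: str) -> Optional[str]:
    """Return the format type for a filename, or None if it is not listed."""
    suffix = Path(filename).suffix.lower()
    for format_type, extensions in SUPPORTED_FORMATS.items():
        if suffix in extensions:
            return format_type
    return None


def split_supported(filenames):
    """Split filenames into (supported, rejected) lists, preserving order."""
    supported = [name for name in filenames if classify_file(name) is not None]
    rejected = [name for name in filenames if classify_file(name) is None]
    return supported, rejected
```

Because additional formats may be enabled in a given environment, a check like this is best treated as advisory rather than as a hard gate.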

After uploading the required files, click Continue.


Step 3 — Content

The Content step controls what content should be extracted and processed from the uploaded files.

This step includes:

  • Content Types
  • Additional Collections
  • Uploaded Files Structure
  • Content Summary

Content Types

Select the types of content that should be extracted during ingestion.

Supported content types currently include:

| Content Type | Description |
|---|---|
| text | Written text extracted from uploaded files |
| image | Images discovered within uploaded files |
| video | Video content extracted from uploaded media |
| audio | Audio content extracted during processing |
| document | Structured document content |

Multiple content types can be selected depending on the ingestion requirements.


Additional Collections

Additional Collections allow the ingestion engine to create specialized semantic groupings in addition to the primary text content.

Supported collections currently include:

| Collection | Description |
|---|---|
| Products | Product-related content and metadata |
| FAQs | Frequently asked questions |
| API References | Technical API documentation |
| Code Snippets | Source code and implementation examples |

These collections help improve downstream semantic retrieval and contextual relevance.
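
To make the Content step choices concrete, the sketch below models a selection of content types and additional collections as a small data structure with validation. The wizard's real configuration format is not documented here, so `ContentSelection` and its fields are hypothetical; only the allowed values come from this page.

```python
from dataclasses import dataclass, field

# Values copied from the Content Types and Additional Collections lists.
CONTENT_TYPES = {"text", "image", "video", "audio", "document"}
COLLECTIONS = {"Products", "FAQs", "API References", "Code Snippets"}


@dataclass
class ContentSelection:
    """Hypothetical container for the choices made in the Content step."""

    content_types: set
    collections: set = field(default_factory=set)

    def validate(self):
        """Return a list of problems; an empty list means the selection is valid."""
        problems = []
        for name in sorted(self.content_types - CONTENT_TYPES):
            problems.append(f"unknown content type: {name}")
        for name in sorted(self.collections - COLLECTIONS):
            problems.append(f"unknown collection: {name}")
        return problems
```

For example, `ContentSelection({"text", "image"}, {"FAQs"}).validate()` returns an empty list, while a selection containing a value that is not listed above is reported as a problem.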


Uploaded Files Structure

The Uploaded Files Structure section displays the uploaded files in a selectable hierarchy.

Users can:

  • Select specific uploaded files
  • Select all uploaded files
  • Include or exclude individual files from processing

Only the selected files will be processed during ingestion.

This allows organizations to precisely control which uploaded content becomes part of the Agent knowledge store.


Content Summary

The Content Summary section displays ingestion metrics based on the selected content.

Typical metrics include:

| Metric | Description |
|---|---|
| Selected Files | Number of files selected for ingestion |
| Content Types | Number of enabled content types |
| Total Files | Total uploaded files available in the ingestion source |

The summary dynamically updates based on the current content selection.
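
The three metrics can be derived directly from the file selection and the enabled content types. The following is a minimal sketch of that calculation, assuming each uploaded file carries a selected flag; `UploadedFile` and `content_summary` are hypothetical names, not platform APIs.

```python
from dataclasses import dataclass


@dataclass
class UploadedFile:
    """Hypothetical record for one entry in the Uploaded Files Structure."""

    name: str
    selected: bool = True


def content_summary(files, content_types):
    """Compute the summary metrics for the current selection."""
    return {
        # Only selected files are processed during ingestion.
        "Selected Files": sum(1 for f in files if f.selected),
        # Duplicates are ignored when counting enabled content types.
        "Content Types": len(set(content_types)),
        "Total Files": len(files),
    }
```

Recomputing this dictionary whenever a file is toggled or a content type is changed mirrors the dynamic behavior described above.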


Starting Ingestion

After completing content selection, click Process to begin ingestion.

The ingestion engine performs:

  1. File processing
  2. Content extraction
  3. Content normalization
  4. Semantic chunk generation
  5. Embedding creation
  6. Vector storage
  7. Metadata persistence

The processed knowledge becomes available to the current Conversational Agent after ingestion completes successfully.
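
The stages above can be sketched end to end in a few lines. This is a toy illustration under stated assumptions, not the platform's ingestion engine: it starts from text that has already been extracted, uses fixed-size word windows in place of semantic chunking, uses a hashed bag-of-words vector in place of a real embedding model, and keeps everything in memory. All names (`AgentKnowledgeStore`, `embed`, `chunk`, `normalize`) are hypothetical.

```python
import hashlib
import math
import re
from collections import defaultdict

DIM = 64  # size of the toy embedding vector (arbitrary choice)


def normalize(text):
    """Content normalization: collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip()


def chunk(text, max_words=50):
    """Chunk generation: fixed-size word windows (a stand-in for semantic chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Embedding creation: hashed bag-of-words, L2-normalized (not a real model)."""
    vec = [0.0] * DIM
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class AgentKnowledgeStore:
    """In-memory vector storage, partitioned by Agent."""

    def __init__(self):
        self._records = defaultdict(list)

    def ingest(self, agent_id, filename, raw_text):
        """Run normalize -> chunk -> embed -> store; return the chunk count."""
        chunks = chunk(normalize(raw_text))
        for position, piece in enumerate(chunks):
            self._records[agent_id].append({
                "vector": embed(piece),
                "text": piece,
                # Metadata persistence: remember where each chunk came from.
                "metadata": {"file": filename, "chunk": position},
            })
        return len(chunks)

    def search(self, agent_id, query, top_k=3):
        """Rank this Agent's chunks by cosine similarity to the query."""
        query_vec = embed(query)
        scored = [
            (sum(a * b for a, b in zip(query_vec, record["vector"])), record)
            for record in self._records.get(agent_id, [])
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [record for _, record in scored[:top_k]]
```

Keying the store by `agent_id` reflects the Agent-specific scoping described on this page: a search made on behalf of a different Agent returns nothing, even if the same store instance holds the data.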


Monitoring Ingestion

Once ingestion begins, the configured source appears in the Ingestions dashboard.

The dashboard provides visibility into:

  • Ingestion status
  • Uploaded files
  • Ingestion runs
  • Runtime metrics
  • File-level operations

Users can later:

  • Review ingestion runs
  • Rerun ingestion
  • Delete uploaded files
  • Delete ingestion runs
  • Edit ingestion configuration

Notes

  • File Upload ingestion is Agent-specific.
  • Knowledge ingested for one Agent is not shared with other Agents.
  • Ingestion usage contributes to the environment-level ingestion quota.
  • The Characters Limit is governed by the DATA_INGESTION_LIMIT subscription entitlement.
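
As an illustration of how a character-based quota might be checked before ingestion, consider the sketch below. It assumes the limit is a single maximum number of characters counted across ingested text; how the platform actually measures usage against `DATA_INGESTION_LIMIT` is not specified here, and `check_character_quota` is a hypothetical helper.

```python
def check_character_quota(texts, used_characters, data_ingestion_limit):
    """Compare the characters about to be ingested with the remaining quota.

    Assumption: usage is measured as a plain character count of the
    extracted text. The real accounting rules may differ.
    """
    requested = sum(len(text) for text in texts)
    remaining = data_ingestion_limit - used_characters
    return {
        "requested": requested,
        "remaining": remaining,
        "allowed": requested <= remaining,
    }
```

For instance, with 90 of 100 characters already used, a request for 5 more characters fits, while the same request with 97 characters used does not.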