Invest In Crypto News
  • Home
  • Latest News
    • Bitcoin News
    • Altcoin News
    • Ethereum News
    • Blockchain News
    • Doge News
    • NFT News
    • Video
    • Market Analysis
    • Business
    • Finance
    • Politics
    • Mining
    • Regulation
    • Technology
  • Top 10 Cryptos
  • Market Cap List
  • IC DAO
  • Donations
  • Contact
  • Buy Crypto
  • IC DAO
No Result
View All Result
Invest In Crypto News
  • Home
  • Latest News
    • Bitcoin News
    • Altcoin News
    • Ethereum News
    • Blockchain News
    • Doge News
    • NFT News
    • Video
    • Market Analysis
    • Business
    • Finance
    • Politics
    • Mining
    • Regulation
    • Technology
  • Top 10 Cryptos
  • Market Cap List
  • IC DAO
  • Donations
  • Contact
  • Buy Crypto
  • IC DAO
No Result
View All Result
Invest In Crypto News
No Result
View All Result

NVIDIA Unveils Blueprint for Enterprise-Scale Multimodal Document Retrieval Pipeline

CryptoExpert by CryptoExpert
August 30, 2024
in Blockchain News
0
Nvidia's Soaring Data Center Revenue Signals Strong AI and GPU Market Position
  • Facebook
  • Twitter
  • Pinterest


You might also like

Key Features to Prioritize in In-House Legal Software

AI Reshapes Contract Drafting for Legal Teams

LG Electronics, Arbitrum Launch Blockchain Ad Network



Caroline Bishop
Aug 30, 2024 01:27

NVIDIA introduces an enterprise-scale multimodal document retrieval pipeline using NeMo Retriever and NIM microservices, enhancing data extraction and business insights.





In an exciting development, NVIDIA has unveiled a comprehensive blueprint for building an enterprise-scale multimodal document retrieval pipeline. This initiative leverages the company’s NeMo Retriever and NIM microservices, aiming to revolutionize how businesses extract and utilize vast amounts of data from complex documents, according to NVIDIA Technical Blog.

Harnessing Untapped Data

Every year, trillions of PDF files are generated, containing a wealth of information in various formats such as text, images, charts, and tables. Traditionally, extracting meaningful data from these documents has been a labor-intensive process. However, with the advent of generative AI and retrieval-augmented generation (RAG), this untapped data can now be efficiently utilized to uncover valuable business insights, thereby enhancing employee productivity and reducing operational costs.

The multimodal PDF data extraction blueprint introduced by NVIDIA combines the power of the NeMo Retriever and NIM microservices with reference code and documentation. This combination allows for accurate extraction of knowledge from massive volumes of enterprise data, enabling employees to make informed decisions swiftly.

Building the Pipeline

The process of building a multimodal retrieval pipeline on PDFs involves two key steps: ingesting documents with multimodal data and retrieving relevant context based on user queries.

okex

Ingesting Documents

The first step involves parsing PDFs to separate different modalities such as text, images, charts, and tables. Text is parsed as structured JSON, while pages are rendered as images. The next step is to extract textual metadata from these images using various NIM microservices:


nv-yolox-structured-image: Detects charts, plots, and tables in PDFs.
DePlot: Generates descriptions of charts.
CACHED: Identifies various elements in graphs.
PaddleOCR: Transcribes text from tables and charts.

After extracting the information, it is filtered, chunked, and stored in a VectorStore. The NeMo Retriever embedding NIM microservice converts the chunks into embeddings for efficient retrieval.

Retrieving Relevant Context

When a user submits a query, the NeMo Retriever embedding NIM microservice embeds the query and retrieves the most relevant chunks using vector similarity search. The NeMo Retriever reranking NIM microservice then refines the results to ensure accuracy. Finally, the LLM NIM microservice generates a contextually relevant response.

Cost-Effective and Scalable

NVIDIA’s blueprint offers significant benefits in terms of cost and stability. The NIM microservices are designed for ease of use and scalability, allowing enterprise application developers to focus on application logic rather than infrastructure. These microservices are containerized solutions that come with industry-standard APIs and Helm charts for easy deployment.

Moreover, the full suite of NVIDIA AI Enterprise software accelerates model inference, maximizing the value enterprises derive from their models and reducing deployment costs. Performance tests have shown significant improvements in retrieval accuracy and ingestion throughput when using NIM microservices compared to open-source alternatives.

Collaborations and Partnerships

NVIDIA is partnering with several data and storage platform providers, including Box, Cloudera, Cohesity, DataStax, Dropbox, and Nexla, to enhance the capabilities of the multimodal document retrieval pipeline.

Cloudera

Cloudera’s integration of NVIDIA NIM microservices in its AI Inference service aims to combine the exabytes of private data managed in Cloudera with high-performance models for RAG use cases, offering best-in-class AI platform capabilities for enterprises.

Cohesity

Cohesity’s collaboration with NVIDIA aims to add generative AI intelligence to customers’ data backups and archives, enabling quick and accurate extraction of valuable insights from millions of documents.

Datastax

DataStax aims to leverage NVIDIA’s NeMo Retriever data extraction workflow for PDFs to enable customers to focus on innovation rather than data integration challenges.

Dropbox

Dropbox is evaluating the NeMo Retriever multimodal PDF extraction workflow to potentially bring new generative AI capabilities to help customers unlock insights across their cloud content.

Nexla

Nexla aims to integrate NVIDIA NIM in its no-code/low-code platform for Document ETL, enabling scalable multimodal ingestion across various enterprise systems.

Getting Started

Developers interested in building a RAG application can experience the multimodal PDF extraction workflow through NVIDIA’s interactive demo available in the NVIDIA API Catalog. Early access to the workflow blueprint, along with open-source code and deployment instructions, is also available.

Image source: Shutterstock



Source link

  • Facebook
  • Twitter
  • Pinterest
CryptoExpert

CryptoExpert

Recommended For You

Key Features to Prioritize in In-House Legal Software

by CryptoExpert
June 13, 2026
0
10BedICU Leverages OpenAI's API to Revolutionize Critical Care in India

Terrill Dicki Jun 12, 2026 22:33 What to prioritize in in-house legal software: AI capabilities, document review, integration, and enterprise-grade security for enhanced efficiency. ...

Read more

AI Reshapes Contract Drafting for Legal Teams

by CryptoExpert
June 13, 2026
0
Pyth Network Integrates Price Oracles with IOTA EVM

Iris Coleman Jun 12, 2026 14:30 AI-assisted contract drafting is transforming legal workflows by speeding up processes, enhancing quality, and shifting lawyers' roles toward...

Read more

LG Electronics, Arbitrum Launch Blockchain Ad Network

by CryptoExpert
June 12, 2026
0
Cointelegraph

South Korean tech giant LG Electronics is working with the Ethereum layer-2 network Arbitrum to build a blockchain-based advertising network aimed at serving the digital ad industry. Arbitrum would...

Read more

TRM Warns of World Cup Crypto Scams Targeting Fans

by CryptoExpert
June 12, 2026
0
Cointelegraph

TRM Labs warned that crypto scammers are targeting FIFA World Cup fans through fake ticketing sites, fixed-match betting schemes and event-themed crypto promotions. The blockchain intelligence company said it...

Read more

Binance Launches bStocks on BNB Chain: Trade Tokenized US Equities 24/7

by CryptoExpert
June 12, 2026
0
BNB Chain Resolves BscScan Lag Issue, opBNB Still Undergoing Fixes

Terrill Dicki Jun 11, 2026 14:27 Binance debuts bStocks on BNB Chain, enabling 24/7 trading of tokenized US stocks with zero fees and self-custody...

Read more
Next Post
Judge Dismisses $258 Billion Claim

Judge Dismisses $258 Billion Claim

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Browse by Category

  • Altcoin News
  • Bitcoin News
  • Blockchain News
  • Business
  • Doge News
  • Ethereum News
  • Finance
  • Market Analysis
  • Mining
  • NFT News
  • Politics
  • Regulation
  • Technology
  • Trending Cryptos
  • Video

Sitemap

  • Market Cap
  • Donations
  • Trading
  • Mining
  • Contact

Legal Information

  • Privacy Policy
  • Anti-Spam Policy
  • Copyright Notice
  • DMCA Compliance
  • Social Media Disclaimer
  • Terms Of Service

Categories

  • Altcoin News
  • Bitcoin News
  • Blockchain News
  • Business
  • Doge News
  • Ethereum News
  • Finance
  • Market Analysis
  • Mining
  • NFT News
  • Politics
  • Regulation
  • Technology
  • Trending Cryptos
  • Video

© Copyright 2024 InvestInCryptoNews.com

No Result
View All Result
  • Home
  • Latest News
    • Bitcoin News
    • Altcoin News
    • Ethereum News
    • Blockchain News
    • Doge News
    • NFT News
    • Video
    • Market Analysis
    • Business
    • Finance
    • Politics
    • Mining
    • Regulation
    • Technology
  • Top 10 Cryptos
  • Market Cap List
  • IC DAO
  • Donations
  • Contact
  • Buy Crypto
  • IC DAO

© Copyright 2024 InvestInCryptoNews.com

This website is using cookies to improve the user-friendliness. You agree by using the website further.

Privacy policy
bitcoin
Bitcoin (BTC) $ 64,518.00
ethereum
Ethereum (ETH) $ 1,682.70
tether
Tether (USDT) $ 0.999526
bnb
BNB (BNB) $ 609.20
usd-coin
USDC (USDC) $ 0.999809
xrp
XRP (XRP) $ 1.15
solana
Solana (SOL) $ 69.11
tron
TRON (TRX) $ 0.317029
figure-heloc
Figure Heloc (FIGR_HELOC) $ 1.02
staked-ether
Lido Staked Ether (STETH) $ 2,265.05

Pin It on Pinterest

Are you sure want to unlock this post?
Unlock left : 0
Are you sure want to cancel subscription?