New Data Loading Tool Saves Time and Speeds Data Pipelines Between Popular Data Stores
SAN FRANCISCO, CA–(Marketwired – Dec 30, 2014) – MemSQL, the leader in real-time and historical Big Data analytics, today introduced a new productivity tool that dramatically improves data ingest from popular data stores like Amazon Web Services S3 and the Hadoop Distributed File System (HDFS). Whereas typical data loading often requires multiple steps, the MemSQL Loader enables direct streaming from the originating datastore in a single transfer. MemSQL allows for multiple parallel input streams, further increasing performance and reducing time-consuming repetitive operations. The MemSQL Loader is also available as open source, providing developers the ability to adapt it to their favorite data source.
“The MemSQL Loader is another innovation of simplifying MemSQL implementations with production data workflows,” said Nikita Shamgunov, Chief Technology Officer and co-founder, MemSQL. “After working with customers during their MemSQL deployments, we found a simple way to eliminate steps in data pipelines, saving them time and reducing complexity,” he continued. “By streaming directly from popular data stores like Amazon S3 and HDFS, we offer customers an easy way to get started, and an efficient way to integrate the real-time transactions and analytics of MemSQL into existing environments.”
Data ingest often involves multiple files or objects, particularly with scalable storage services like S3. In certain cases, customers may have hundreds or thousands of files to import. Other import methods operate at the job level, meaning that if just one file fails, the entire job must be restarted. MemSQL Loader supports loading batches of thousands of files or objects automatically without having to specify files individually. This enables a synchronization path that loads only new or changed files as they are updated, and allows a load to restart at a specific file if an import issue occurs.
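As a rough illustration of file-level tracking in general (not the MemSQL Loader implementation, whose internals are not described here), the idea can be sketched in Python. The names `SyncState`, `sync_files`, and `load_one` are hypothetical, and the fingerprint is assumed to be any value that changes when a file changes, such as a checksum.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class SyncState:
    """Hypothetical record of what has already been loaded.

    Maps file name -> fingerprint of the last successful load.
    """
    loaded: Dict[str, str] = field(default_factory=dict)


def sync_files(
    files: Iterable[Tuple[str, str]],
    load_one: Callable[[str], None],
    state: SyncState,
) -> Tuple[List[str], List[str], List[str]]:
    """Load only new or changed files, tracking each file separately.

    `files` yields (name, fingerprint) pairs. `load_one` is a caller-supplied
    function (an assumption of this sketch) that imports a single file and
    raises an exception on failure.
    Returns three lists of file names: (loaded, skipped, failed).
    """
    loaded: List[str] = []
    skipped: List[str] = []
    failed: List[str] = []
    for name, fingerprint in files:
        # Unchanged since the last successful load: nothing to do.
        if state.loaded.get(name) == fingerprint:
            skipped.append(name)
            continue
        try:
            load_one(name)
        except Exception:
            # A failure affects this file only; the rest of the batch continues.
            failed.append(name)
            continue
        # Record success so a later run skips this file unless it changes.
        state.loaded[name] = fingerprint
        loaded.append(name)
    return loaded, skipped, failed
```

Because state is kept per file, rerunning `sync_files` with the same `SyncState` after a failure retries only the files that failed or changed, rather than restarting the whole job.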
MemSQL Loader automates loading processes and enables queuing of jobs, further simplifying ingest. MemSQL administrators now have a thorough loading solution that eliminates steps from the process, scales performance across a distributed database, monitors loads at file-level granularity, and offers connectivity to common data stores like S3 and HDFS.
To download MemSQL Loader, visit the MemSQL GitHub site.
Read the technical blog post at: http://blog.memsql.com/memsql-loader/
MemSQL enables companies to merge real-time and historical Big Data analytics with its distributed in-memory database. MemSQL instantly accesses data through a familiar SQL interface on a horizontally scalable architecture using commodity hardware across on-premises or cloud deployments.
Innovative enterprises with data-intensive environments use MemSQL to accelerate insights, extract previously untapped value from their data, and drive new revenue opportunities. Based in San Francisco, MemSQL is a Y Combinator company funded by Accel Partners, Khosla Ventures, First Round Capital, and Data Collective.