Apache Doris

Software Development

San Francisco, California 2,347 followers

Apache Doris is an open-source real-time data warehouse based on MPP architecture.

About us

Apache Doris is an open-source real-time data warehouse based on MPP architecture, known for its speed and ease of use. It supports real-time data ingestion and real-time query response in both high-concurrency point query and high-throughput analysis scenarios, so users can process and analyze large datasets with very low latency. In June 2022, Apache Doris graduated from the Apache Incubator and became a top-level project of the Apache Software Foundation (ASF). It has attracted nearly 600 contributors, and more than 20,000 developers use Apache Doris today. Doris also runs in production at over 2,000 companies around the world, including AWS, Fuse, JD.com, Lenovo, OPPO, Shopee, TikTok, Tencent, Vivo, Xiaomi, and others. We welcome more open-source technology enthusiasts to join the Apache Doris community and discover infinite possibilities together!
Learn more about Apache Doris on GitHub: https://github.com/apache/doris
Join the Apache Doris community on Slack: https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ

Website: https://doris.apache.org/
Industry: Software Development
Company size: 201-500 employees
Headquarters: San Francisco, California
Type: Nonprofit
Founded: 2018

Locations

Primary

San Francisco, California 94102, US

Beijing, Beijing 100086, CN


Updates

Apache Doris

2,347 followers
5mo
📢 We are thrilled to announce the release of Apache Doris 2.1.0! For our long-time supporters, allow us to re-introduce Apache Doris with its new features and substantially improved data writing and query performance. For those who are new to Apache Doris, this is a great time for a proof of concept to see how it performs in your use case. Buckle up and get ready for:
🚶 100% faster out-of-the-box performance, proven by TPC-DS benchmark tests
🚶 Improved data lake analytics capabilities: 4 to 6 times faster than Trino and Spark
🏃 Solid support for semi-structured data analysis
🏃 Materialized views across multiple tables to accelerate multi-table joins
💃 Enhanced real-time writing efficiency powered by the AUTO_INCREMENT column, AUTO PARTITION, forward placement of MemTable, and Group Commit
🕺 Better workload management for higher performance stability
https://lnkd.in/gjVXD6gQ
#database #dataengineering #analytics #bigdata #opensource

Another big leap: Apache Doris 2.1.0 is released - Apache Doris

doris.apache.org

Apache Doris

2,347 followers
17h
The design of the compute-storage decoupled mode of Apache Doris, available in the upcoming version 3.0, highlights the transformation of the FE's in-memory metadata model into a shared metadata service. This approach offers a globally consistent state view, allowing any node to directly submit writes without needing to go through the FE for publishing. During write operations, data is stored in shared storage, while metadata is managed by the metadata service. This effectively controls the number of small files in shared storage. Meanwhile, the real-time write performance for individual tables is nearly on par with that in the compute-storage coupled mode. The system's overall write capacity is no longer limited by the processing power of a single FE node.
Apache Doris reposted this

PRINCIPAL

2,158 followers
3d
🎯 Summer with data: open-source savings at PRINCIPAL. Even in summer, our #PRINCIPAL data lab is not standing idle! Here is the latest news our team has been working hard on, explained by our colleague Tomas Novotny, who is behind our most recent tests. "The goal of our data lab is to assemble a stable solution from open-source components that saves our customers money on their data platform. If a customer is shocked by an excessively high bill for a cloud data warehouse, we at #PRINCIPAL have a solution. For exactly this purpose we are testing, for example, #Apache Doris, an MPP database with a philosophy similar to that of #Teradata or #Snowflake. Doris consumes data from Kafka on its own and uses it to fill the landing layer of the data warehouse. The database is well suited as a platform for an ODS (operational data store), or wherever one is considering #Elastic and #Kibana for that purpose. For testing we use, among other things, freely available data on traffic density on selected road sections, which we combine, for example, with the results of scoring webcam photos using AI models from #HuggingFace. The resulting dashboards are built in #Grafana, which shows us the current traffic situation every 10 seconds. We are finding out for ourselves how much money can be saved by using well-tamed open-source software in an on-premises or hybrid setup." What is your experience with open-source tools in your data infrastructure?
Apache Doris

2,347 followers
4d
How does Apache Doris make life easier for data engineers?
🦉 Self-adaptive query optimizer: A query optimizer that delivers fast performance for most use cases without any manual fine-tuning. https://lnkd.in/gFZiieCH
🦉 Auto Partition: Supports partitioning data by RANGE or by LIST and further enhances flexibility in data sharding. https://lnkd.in/gHmmqJg5
🦉 Auto Increment: A useful feature for dictionary encoding, primary key generation, data updates, and pagination. https://lnkd.in/gr6tvnu2
🦉 SQL Dialect Convertor: Doris supports multiple SQL dialects, including Presto, Trino, Hive, PostgreSQL, Spark, Oracle, and ClickHouse. Users can query data in Doris directly using these SQL dialects. https://lnkd.in/gMXK5s9k
🦉 X2Doris: A visual tool that facilitates data migration from other OLAP systems to Doris. https://lnkd.in/gqTg-D2M
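To make the dictionary-encoding use of Auto Increment more concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not Doris code: `DictionaryEncoder` and the sample user names are hypothetical, and a plain counter stands in for an auto-increment column. The idea is that each distinct string receives a unique integer ID the first time it is seen, so later processing can work on compact integers instead of strings.

```python
from itertools import count


class DictionaryEncoder:
    """Hypothetical stand-in for a dictionary table with an auto-increment ID column."""

    def __init__(self, start=1):
        self._next_id = count(start)  # plays the role of the auto-increment counter
        self._ids = {}                # value -> assigned integer ID

    def encode(self, value):
        # Assign a fresh ID only the first time a value is seen.
        if value not in self._ids:
            self._ids[value] = next(self._next_id)
        return self._ids[value]


user_names = ["alice", "bob", "alice", "carol", "bob"]
encoder = DictionaryEncoder()
encoded = [encoder.encode(name) for name in user_names]
print(encoded)
```

The sketch hands out contiguous IDs because it uses a single counter. A distributed database may only promise unique values, so code that consumes such IDs should not assume they are gap-free.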
Apache Doris

2,347 followers
5d
We just added a What's New page to the Apache Doris website, so you can stay informed about newly published tutorials and guides. We are continuously improving our documentation to provide a better user experience and would love to hear your feedback at any time. https://lnkd.in/ggsCggbj
Apache Doris

2,347 followers
6d
A use case of Apache Doris supporting the farm-to-fork journey of a listed fresh agri-produce enterprise: Before Apache Doris, they combined a real-time data warehouse based on HBase with an offline one based on Hive. With Apache Doris, they have integrated their real-time data streams and batch data processing pipelines to simplify their data architecture.
💡 Benefits they reap:
☑️ 3X personnel efficiency: With SQL alone, the team can complete real-time report development quickly and change the computation logic simply by modifying the SQL.
☑️ $1 million in cost savings: The data compression and computing performance of Doris allow the user to cut storage costs and labor inputs.
☑️ 30X computing efficiency compared to Hive.
☑️ Convenient data ingestion: Real-time data ingestion requires only configuration on a web page, avoiding the tedious work of modifying and uploading Flink JAR packages.
Apache Doris

2,347 followers
1w
Apache Doris provides native support for several key features of Apache Iceberg:
1️⃣ Multiple Iceberg Catalog types, including Hive Metastore, Hadoop, REST, Glue, Google Dataproc Metastore, and DLF.
2️⃣ Iceberg V1/V2 table formats, including reading Position Delete and Equality Delete files.
3️⃣ Querying the snapshot history of Iceberg tables through table functions.
4️⃣ Time Travel.
5️⃣ Iceberg table engine: users can create and manage Iceberg tables and write data to them directly from Apache Doris.
6️⃣ Partition Transform functions, enabling hidden partitioning and schema evolution.
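As a rough illustration of what snapshot history and Time Travel mean in the list above, the following Python sketch resolves an "as of" timestamp to a snapshot. It is a conceptual model under stated assumptions, not the Doris or Iceberg API: the `Snapshot` class, the `snapshot_as_of` helper, and the sample IDs and timestamps are all hypothetical, and the history is assumed to be sorted by commit time.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int      # hypothetical identifier of a committed table version
    committed_at_ms: int  # commit time in milliseconds


def snapshot_as_of(snapshots, timestamp_ms):
    """Return the latest snapshot committed at or before timestamp_ms, or None.

    Assumes `snapshots` is sorted by committed_at_ms in ascending order.
    """
    commit_times = [s.committed_at_ms for s in snapshots]
    idx = bisect_right(commit_times, timestamp_ms)
    return snapshots[idx - 1] if idx else None


history = [Snapshot(101, 1000), Snapshot(102, 2000), Snapshot(103, 3000)]
chosen = snapshot_as_of(history, 2500)
print(chosen.snapshot_id if chosen else None)
```

A time-travel read then scans the data files referenced by the chosen snapshot rather than those of the current table state. A timestamp earlier than the first commit has no matching snapshot, which the sketch reports as `None`.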
Apache Doris reposted this

VeloDB

654 followers
1w
The BYOC (Bring Your Own Cloud) mode of VeloDB Cloud provides data warehouse services based on open-source Apache Doris. It now supports GCP (Google Cloud Platform) in addition to AWS. Start your free trial now: https://lnkd.in/g7ZJ9vSR
Apache Doris

2,347 followers
1w Edited
Apache Doris has evolved into a mature Data Lakehouse solution.
🚗 V0.15
1️⃣ Introduced Hive and Iceberg external tables.
🚄 V1.2
1️⃣ Introduced Multi-Catalog, achieving automatic metadata mapping and data access for various data sources.
✈️ V2.1
1️⃣ Enhanced the Lakehouse architecture with better reading and writing capabilities for mainstream data lake formats (Hudi, Iceberg, Paimon, etc.).
2️⃣ Introduced compatibility with multiple SQL dialects.
3️⃣ Enabled seamless migration from existing systems to Apache Doris.
4️⃣ Integrated the Arrow Flight high-speed reading interface, improving data transfer efficiency by 100X.
As we publish more guides on the data lakehouse capabilities of Apache Doris, here is a quick start for querying data in Apache Paimon with Doris. https://lnkd.in/gFQPAyVn

Apache Doris & Paimon Quick Start - Apache Doris

doris.apache.org

Apache Doris

2,347 followers
1w
Welcome to the open-source world! Polaris Catalog by Snowflake is now available on GitHub. It enables interoperability with Apache Doris, which means Doris users can access Iceberg tables in Snowflake via the Polaris Catalog. https://lnkd.in/g2vFNNU7

Polaris Catalog Is Now Open Source

snowflake.com

