We explore the exciting new features of BDB Data Pipeline 9.0, designed to enhance capability and usability with a redesigned UI. BDB Pipeline covers all the key capabilities: #DataIngestion #DataIntegration #Scalability #DataQuality #DataSecurity #DataOrchestration #DataStorage #DataCatalog #DataGovernance #PerformanceOptimization #FaultTolerance #Reliability #Monitoring #Logging #Interoperability #FriendlyUX #Documentation #Support
Key updates in version 9.0 include:
Job Overview: Consolidate all job-related data for simplified management and troubleshooting.
Pipeline Overview: Get a comprehensive view of pipeline statuses, including running, interrupted, failed, or ignored pipelines.
Enhanced Configuration UI: Separate tabs for pipeline and job settings for easier management.
System Pods Details Enhancement: Improved interaction with system logs and detailed system resource insights.
Component Version Update: Seamlessly update component versions with each release.
Athena Query Executor: Read data directly from AWS Athena for more efficient query execution.
Advanced Sandbox Writer: Support for partitioned directory data management with part files.
Data Metrics Enhancement: Apply specific date ranges for tailored data analysis.
ORC File Support: Read and write ORC files in Sandbox, HDFS, and S3 environments (see the sketch after this post).
Job List Edit Button: Quickly access and modify job configurations.
Job Trigger Component: Initiate Python jobs on demand with JSON payloads.
Script Executor Job: Execute scripts in various programming languages from Git repositories.
Python On-Demand Job: Trigger and execute Python jobs as needed.
Enhanced Job List Page: View detailed job configurations for improved accessibility.
Customizable Pipeline Overview: Enhance your pipeline with customizable color themes and expanded descriptions.
Enhanced Kafka Preview Panel: Access data in CSV, Excel, and JSON formats for efficient analysis.
https://lnkd.in/gDakXJt2
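For anyone new to the ORC format mentioned above, here is a minimal sketch of reading and writing ORC files with pyarrow, locally and on S3. This is a general illustration only, not the BDB Sandbox/HDFS/S3 components themselves; the bucket name, region, and file paths are hypothetical placeholders.

```python
# General ORC read/write illustration with pyarrow (not BDB-specific).
import pyarrow as pa
import pyarrow.orc as orc
from pyarrow import fs

# Build a small table and write it as a local ORC file.
table = pa.table({"meter_id": [1, 2, 3], "reading_kwh": [10.5, 7.2, 12.9]})
orc.write_table(table, "readings.orc")

# Read it back and inspect the schema.
local = orc.read_table("readings.orc")
print(local.schema)

# The same calls work against object storage via a pyarrow filesystem,
# e.g. S3 (credentials resolved from the environment; names are placeholders).
s3 = fs.S3FileSystem(region="eu-central-1")
with s3.open_output_stream("my-example-bucket/curated/readings.orc") as sink:
    orc.write_table(local, sink)
```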
-
Three things to remember about Cortex:
1. It provides a familiar UI for your Cherwell data
2. It requires NO database changes
3. It requires minimal configuration
In today’s video, we show the second one in probably the most ridiculous demo we’ve ever done. NO data changes. NO export/import, NO ETL, NO CSV, yet ALL of the data!
Cortex (Cherwell Archive) https://lnkd.in/g8qycQ85 #cherwell #cherwellservicemanagement
-
Data Architect ✍️ Technical writer 🛠️ Product Developer 💡 Full Stack Data Engineer 👨💼 Teckiy Founder ▶️ Youtube "Simplified Data Engineering"
🚀 ETL - Data Quality Check! 🚀 Below are some high-level checks to consider (a sketch of one of them follows this post):
1️⃣ Historical Data Check 🕰️
2️⃣ Incremental Load Verification 🆕
3️⃣ Transformation Accuracy 🔍
4️⃣ Referential Integrity Maintenance 🔗
5️⃣ Latency Monitoring ⏱️
6️⃣ Error & Rejection Management ⚠️
7️⃣ Backup & Archive Verification 🗃️
8️⃣ Security & Compliance Adherence 🛡️
9️⃣ Performance Optimization 🚀
🔟 Alerting & Monitoring Functionality 🚨
1️⃣1️⃣ Volume Management 📊
1️⃣2️⃣ Data Reconciliation ✅
1️⃣3️⃣ Usability Assurance 🖥️
1️⃣4️⃣ Documentation & Metadata Accuracy 📝
#DataEngineering #ETL #DataQuality #DataMigration #DataManagement
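As a concrete illustration of one item on this list, here is a minimal sketch of a data reconciliation check (item 12): compare row counts, key coverage, and a measure total between a source extract and the loaded target. The table and column names (order_id, amount) are hypothetical; adapt them to your own staging and target layers.

```python
# Hypothetical reconciliation check between source and target frames.
import pandas as pd

def reconcile(source: pd.DataFrame, target: pd.DataFrame, key: str) -> dict:
    """Return simple reconciliation metrics between source and target."""
    missing_keys = set(source[key]) - set(target[key])
    extra_keys = set(target[key]) - set(source[key])
    return {
        "source_rows": len(source),
        "target_rows": len(target),
        "missing_in_target": len(missing_keys),
        "unexpected_in_target": len(extra_keys),
        "amount_sum_matches": bool(
            abs(source["amount"].sum() - target["amount"].sum()) < 1e-6
        ),
    }

source = pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
target = pd.DataFrame({"order_id": [1, 2], "amount": [10.0, 20.0]})
print(reconcile(source, target, key="order_id"))
# Flags one key missing in the target and a mismatched amount total.
```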
-
4 ways to make a Senior Data Engineer cry:
1. Disregard Data Quality: no validation, no cleansing, no dupe checks.
2. No Reusable Framework: script every ETL process afresh, who needs efficiency?
3. Ignore Speed: use single-threaded processes, distributed big data processing is overrated.
4. Bypass Monitoring: no failure alerts in the pipeline. Who needs more emails?
(Bonus) Tell the business all the data for the new dashboard will be ready in a day.
#DataEngineeringBlues 😢 #DataEngineering
Inspired by the kickass format from Luca Zanna
-
In our project, we've been facing a recurring ETL issue where dynamic file headers in monthly CSV files cause our ETL pipelines to fail. During my POC on this problem, I revisited two key data engineering concepts that will guide our architectural decisions, depending on the use case:
1. Schema on write: the schema is defined and enforced before data is written to the database. It suits structured data where the schema is known up front.
2. Schema on read: the schema is applied when the data is read, not when it is stored. It suits semi-structured data where the schema is not known in advance (see the sketch below).
#dataengineer #warehouse #innovasolutions #etlpipeline #bestpractices
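A minimal sketch of a schema-on-read style ingest for monthly CSVs with drifting headers: load everything as raw text first, then map whatever headers arrive onto a canonical schema at read/transform time instead of failing at load. The canonical columns and their aliases here are hypothetical.

```python
# Hypothetical schema-on-read normalization for CSVs with changing headers.
import pandas as pd

CANONICAL = {
    "customer_id": {"customer_id", "cust_id", "customerid"},
    "usage_kwh": {"usage_kwh", "kwh", "consumption_kwh"},
}

def read_with_canonical_schema(path: str) -> pd.DataFrame:
    raw = pd.read_csv(path, dtype=str)               # no schema enforced at load time
    raw.columns = [c.strip().lower() for c in raw.columns]
    renames = {
        col: canonical
        for canonical, aliases in CANONICAL.items()
        for col in raw.columns
        if col in aliases
    }
    df = raw.rename(columns=renames)
    missing = set(CANONICAL) - set(df.columns)       # surface drift instead of failing silently
    if missing:
        raise ValueError(f"Unmapped canonical columns: {missing}")
    df["usage_kwh"] = pd.to_numeric(df["usage_kwh"], errors="coerce")
    return df[list(CANONICAL)]
```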
-
AMI/IoT Data Integration Expert | Building Reliable Pipelines for Metering and Distribution Networks in 30 Days
So we built this brand new ETL data pipeline to share low-voltage data with network operators and found a BIG GAP several months later.
The pipeline ingests load-profile data (in batches) from the MDMS system in JSON format and then processes and transforms it into the client-specific XML/CSV format.
Good thing: we capture the progress of the transformation job and have a UI to monitor the status and retrigger if the job fails.
Big gap: we don't show the load-profile data used in the job on the UI.
Result: the SIT team spent far too much time verifying the correctness and completeness of the output data against the input data.
Did we not know it? Yes, we did, and the MDP team too. But we ignored it, as nobody was concerned about the visibility of the input.
Lesson learned: when implementing transformations, make the inputs and outputs highly visible to keep everyone aligned (a small sketch follows).
#smartmetering #evcharginginfrastructure #vpp #utilities #dataintegration #systemintegrators #energytransition
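A minimal sketch of that lesson: persist what went into and came out of each transformation run so a test/SIT team (or a UI) can inspect it later. The job-run layout, field names, and the transformation itself are hypothetical placeholders.

```python
# Hypothetical input/output visibility for a transformation job run.
import json
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("transform-job")

def run_transform(job_id: str, input_records: list, run_dir: Path) -> list:
    run_dir.mkdir(parents=True, exist_ok=True)

    # Persist the raw input (or a sample) next to the run metadata so the
    # monitoring UI can show exactly what the job consumed.
    (run_dir / "input_sample.json").write_text(json.dumps(input_records[:100], indent=2))

    output_records = [
        {"meter": r["meter_id"], "kwh": r["value"] / 1000}   # placeholder transformation
        for r in input_records
    ]

    (run_dir / "output_sample.json").write_text(json.dumps(output_records[:100], indent=2))
    log.info("job=%s input_rows=%d output_rows=%d",
             job_id, len(input_records), len(output_records))
    return output_records
```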
-
Are you an ETL Developer drowning in manual data validation? Automate the time-consuming processes holding you back and reclaim your time. https://hubs.ly/Q02JdG0r0 #ETL #DataValidation #DataOpsSuite
-
Data Integration is a big world with lots of tools.
SSIS / SQL Server Integration Services / Data Integration / Data Transformation / On-premises Tool
https://www.youtube.com/
-
Driving Business Innovation with Data | Skilled in Python, SQL & ETL | Expert in AI, ML & Generative AI | Cloud Solutions Specialist with Azure, AWS & GCP | Building Everyday Experiences with Smart Tech & NexGen AI Tools
🔧🚀 Have you ever wondered how much impact optimized data pipelines can have? By refining ETL processes, I’ve seen processing times cut by up to 50%! Here’s what works (a small incremental-loading sketch follows this post):
• Automated Testing: catch issues before they escalate.
• Incremental Loading: save time and resources.
• Data Quality Checks: ensure your data is clean and reliable.
What techniques do you use to enhance efficiency? #DataEngineering #Optimization
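One way to realize the incremental loading point is a watermark-based extract: only pull rows changed since the last successful run instead of reloading the whole table. A minimal sketch below uses an in-memory SQLite table; the table, columns, and watermark storage are hypothetical.

```python
# Hypothetical watermark-based incremental load.
import sqlite3
from datetime import datetime, timezone

def load_incrementally(conn: sqlite3.Connection, last_watermark: str):
    """Fetch rows updated after the previous watermark and return the new one."""
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark

# Usage: persist the watermark between runs (control table, S3 object, etc.).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, updated_at TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 10.0, ?)",
             (datetime.now(timezone.utc).isoformat(),))
rows, watermark = load_incrementally(conn, last_watermark="1970-01-01T00:00:00")
print(len(rows), watermark)
```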
-
SSE II || Java || Spring Boot || React Js || Python || CMS || AWS || Docker || K8s || Linux || Sql, NoSql
Data JPA Relationship (Entity Mapping) | Spring Boot 3.x Tutorial 3 https://lnkd.in/gvjZw5fp
-
Founder, Data Engineer @ ZippyTec GmbH | Data Migration & Data Engineering Consulting | Data Migration Coaching | AWS Community Builder
𝐖𝐡𝐚𝐭 𝐡𝐚𝐬 𝐛𝐞𝐞𝐧 𝐭𝐡𝐞 𝐦𝐨𝐬𝐭 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐢𝐧𝐠 𝐩𝐫𝐨𝐣𝐞𝐜𝐭 𝐨𝐫 𝐬𝐜𝐞𝐧𝐚𝐫𝐢𝐨 𝐲𝐨𝐮'𝐯𝐞 𝐟𝐚𝐜𝐞𝐝 𝐢𝐧 𝐲𝐨𝐮𝐫 𝐝𝐚𝐭𝐚 𝐜𝐚𝐫𝐞𝐞𝐫 𝐬𝐨 𝐟𝐚𝐫? 𝐋𝐞𝐭'𝐬 𝐥𝐞𝐚𝐫𝐧 𝐟𝐫𝐨𝐦 𝐞𝐚𝐜𝐡 𝐨𝐭𝐡𝐞𝐫!
One of the most challenging projects I faced was building a real-time data pipeline to capture changes from a legacy on-premises database and replicate that data to a cloud-based data lake. The key challenge was the database's outdated technology stack and its lack of native change data capture (CDC) capabilities. We couldn't just plug in an off-the-shelf CDC tool - we had to get creative. After evaluating a few options, we ended up using a combination of database triggers, custom ETL scripts, and a message queue to detect and propagate the changes. It was a complex, multi-step process, but it kept the data flowing without disrupting the existing systems. What was once a slow, manual process became a living, breathing data ecosystem that fueled better decision-making. A rough sketch of the trigger-plus-queue pattern is below.
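A minimal sketch of that trigger + polling + queue pattern: a database trigger (not shown) appends every change to a change_log table; a script then polls that table and publishes new entries to a message queue. The table layout and the publish() stub are hypothetical stand-ins, not the original project's code.

```python
# Hypothetical change_log polling and queue publishing for homegrown CDC.
import json
import sqlite3

def publish(message: dict) -> None:
    # Stand-in for the real message-queue client (RabbitMQ, Kafka, SQS, ...).
    print("publishing:", json.dumps(message))

def poll_change_log(conn: sqlite3.Connection, last_seen_id: int) -> int:
    rows = conn.execute(
        "SELECT id, table_name, op, payload FROM change_log "
        "WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    for change_id, table_name, op, payload in rows:
        publish({"id": change_id, "table": table_name, "op": op,
                 "data": json.loads(payload)})
        last_seen_id = change_id
    return last_seen_id

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE change_log "
             "(id INTEGER PRIMARY KEY, table_name TEXT, op TEXT, payload TEXT)")
conn.execute(
    "INSERT INTO change_log (table_name, op, payload) VALUES ('orders', 'INSERT', ?)",
    (json.dumps({"order_id": 1, "amount": 10.0}),),
)
last_id = poll_change_log(conn, last_seen_id=0)   # in production, poll on an interval
```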