SlideShare a Scribd company logo
Data Integration Basics:
Merging & Joining
Jovita Chan
Technical Support Specialist
Safe Software Inc.
Sara Mak
Technical Support Specialist
Safe Software Inc.
Welcome to Livestorm.
A few ways to engage with us during the webinar:
Audio issues? Click this for 4 simple
troubleshooting steps.
Data silos are often a result of
multiple data formats.
Just because you have different kinds of data,
doesn’t mean they can’t be brought together.
Poll:
What are your biggest
challenge when it comes to
merging and joining data?
Merging and joining disparate
datasets can be streamlined!
FME transformers can do the work for you.
But there are
many paths
you can take
in FME.
Agenda
1 Introduction to Merging & Joining
FeatureJoiner & FeatureMerger
DatabaseJoiner
2 Merging Attributes of Multiple Datasets
InlineQuerier
SQLExecutor & SQLCreator
3 Resources & Next Steps
4 Q&A
Agenda
Flash Demo
One platform, two technologies
FME Form FME Flow
Data Movement and transformations
(“ETL”) workflows are built here.
Brings life to FME Form workflows
FME Flow Hosted
Safe Software managed FME Flow
fme.safe.com/platform
FME Enterprise Integration Platform
Safe & FME
Number
of
supported
data
types
in
FME
1995 2000 2005 2010 2015 2020 2023…
10
100
300
500
GIS
CAD
Database
XML
Raster
3D
BIM
Web
Point
Cloud
Cloud
Big
Data
IOT
Gaming
BI
Indoor
Mapping
AR/VR
Generative
AI
Cloud
Native
Tabular
Merge and Join ANY
Type of Data
BIM and Facilities Management Integration
IBM Shared Parameters:
● IBM.Name
● IBM.Description
● ...
Tririga
BIM
Integrator
ArcGIS
Indoors
Introduction
to Merging &
Joining
Merging or Joining Data
in FME
When data is merged, the attributes from
each dataset are combined together
based on a common identifier and then
output as a single fused dataset.
This is different from appending data.
Terminology
Append/Union Merge/Join
Easy Peasy in FME! Todayʼs Focus
Combine Vertically Combine Horizontally
Append
Merge
Terminology
Append/Union Merge/Join
Easy Peasy in FME! Todayʼs Focus
Combine Vertically Combine Horizontally
Append
Merge
Selecting a
Transformer
Here are some of our
favourites joining data…
The Two Categories: SQL and SQL-Free
SQL Transformers for
Joining & Merging
Require knowledge of SQL.
Includes:
● InlineQuerier
● SQLCreator
● SQLExecutor
SQL-Free Transformers for
Joining & Merging
Same functionality of SQL
transformers. Can be set up easily
without any database knowledge.
Includes:
● FeatureJoiner
● FeatureMerger
● FeatureReader
● DatabaseJoiner
Follow the Flow Chart
1) Is all of my data already inside the workspace?
● If yes, follow the INTERNAL path
● If no, follow the EXTERNAL path
Inspect your data and ask yourself:
FME Community: Merging or Joining Spreadsheet or Database Data
Follow the Flow Chart
1) Is all of my data already inside the workspace?
● If yes, follow the INTERNAL path
● If no, follow the EXTERNAL path
Inspect your data and ask yourself:
2) Does the FeatureJoiner for internal or DatabaseJoiner
for external work for my data?
● If yes, Great! End here.
● If no, continue to question 3
FME Community: Merging or Joining Spreadsheet or Database Data
Follow the Flow Chart
1) Is all of my data already inside the workspace?
● If yes, follow the INTERNAL path
● If no, follow the EXTERNAL path
Inspect your data and ask yourself:
2) Does the FeatureJoiner for internal or DatabaseJoiner
for external work for my data?
● If yes, Great! End here.
● If no, continue to question 3
3) Do I know or want to use SQL?
● If yes, see the transformers in the green SQL box
● If no, see the transformers in the blue box
FME Community: Merging or Joining Spreadsheet or Database Data
Demo
FeatureJoiner &
FeatureMerger
FeatureJoiner
Tutorial Documentation
FeatureMerger
Results in a single match by default
For 1:M, generate a list of matches for
each Requestor by enabling:
● Process Duplicate Suppliers
● Generate List
Tutorial: Working with List Attributes
Tutorial Documentation
FeatureJoiner
● Simpler, and performs
faster than FeatureMerger
● SQL-like terminology
● Handles cardinalities easier than the
FeatureMerger (ie. 1:M, M:N and
M:1)
● Supports multiple matches per
source (by default, 1:M results in a
record for each match)
FeatureMerger
● Handling 1:M joins is a bit more work
● Default: Single match is output for
1:M relationship
● Many Suppliers can be merged on to
a single Requestor.
● Generate lists of matches
● Can create geometries!
Demo
DatabaseJoiner
Midstream Join with DatabaseJoiner
Provides the ability to form a join against a database (incl. CSV & Excel) with an existing dataset in the
workspace.
Read in database formats, Excel or CSV
Matching and accumulation options
Let the database do the work by pre-filtering a
subset of the table you’re reading into the
workspace
Tutorial Documentation
DatabaseJoiner
● Non-Blocking - Joins data
as each row flows by
● Connect to and join to database tables
midstream with one easy transformer!
● Can also connect to Excel and CSV
● Select which attributes you want to join
● Explicit cardinality parameters
● Good for quick lookups - caches first
5000 records
FeatureMerger/Joiner
● Blocking - Wait until all data has been
read before doing processing
● All FME readers can be used
○ No need for anything to be in a
database
● Cardinality options vary
DatabaseJoiner
● Non-Blocking - Joins data
as each row flows by
● Data is external to the workspace and can
be changed without editing the
workspace
● Can merge in multiple attributes
● Explicit cardinality parameters
● Best if the join data is larger or changing
AttributeValueMapper
● Non-Blocking - Joins data
as each row flows by
● No external dependencies
○ Data to lookup is stored in the
workspace
○ Can be imported at design time
from an external source
● Only a single attribute can be added
● 1:1 lookups only
● Very fast
● Best for small, simple, stable mappings
Midstream Join with FeatureReader
● Simpler midstream option
● Join against any database or
spatial format
● Typically used for spatial joins
(more about spatial joins in
Part II)
● Also used to perform tabular
joins with database tables
Midstream Join with FeatureReader
1. Set the WHERE Clause to:
“<Incoming Attribute>” = ʻ@Value(<Initiator Attribute>)ʼ
2. Set Accumulation Mode to Merge Initiator and Result
Documentation
DatabaseJoiner
● Connect to and join to database tables
midstream with one easy transformer!
● Can also connect to Excel and CSV
● Select which attributes you want to join
● Explicit cardinality parameters
● Very efficient, especially when using a
primed cache
○ Automatically done for File-based
joins
FeatureReader
● Can be done midstream which aids in
adding and joining data quickly
● Limited options: Inner Joins only using
WHERE clause
● Does lots of work per input feature
○ Only efficient when 1 input results in
large numbers of output
Merging
Attributes of
Multiple
Datasets
InlineQuerier
● Data already loaded, any data source can
be used
● Uses SQL
● Join data in a temporary SQLite database
● Multiple queries in a single transformer
● Great if you 🩷 SQL
FeatureJoiner
● Data already loaded, any data source can
be used
● Uses SQL-Free joins
● Easy, approachable
● Great performance
InlineQuerier
Move over FeatureJoiner:
● Use any data source
● Utilizes the power of SQL
● Multiple queries
● Join data in a temporary
SQLite database
● Includes spatial
SQLite Tutorial - An Easy Way to Master SQLite Fast
Demo
InlineQuerier
Tips for InlineQuerier
● Loads a temporary SQLite database - so consolidate all your queries if you can
○ Takes some time to load
○ Fast once it’s loaded
● Don’t use it for simple tasks, i.e. Filtering
● Use smart configuration to improve
performance
○ Only define input columns needed
in the queries
Jovita
InlineQuerier
● Any data source can be
used
● Utilizes the power of
SQL
● Join data in a temporary
SQLite database
● Multiple queries
SQLExecutor
● Used to execute SQL
against a database
● Requires an incoming
feature to trigger the SQL
statement
SQLCreator
● Used to execute SQL
against a database
● No incoming feature
required
SQLExecutor & SQLCreator
Move over InlineQuerier
● Needs a database
● Utilizes the power of SQL
● Includes spatial
● You can use any SQL -
○ Select data
○ Create indices
○ Drop tables
You don’t have so just read data
LetYour Database Do The Work
Demo
SQLExecutor
Tips for SQLExecutor or SQLCreator
● Let your database do the work
○ You only load the data you need
● Needs all of your data is in a single database
● SQL - any level of SQL complexity
● Expose attributes if you need them in the workbench
Automating
with FME Flow
Automating
Automate any of your workflows
with apps on FME Flow.
Resources:
Getting Started with FME Flow
Getting Started with FME Flow
Workspace Apps
Getting Started with Gallery Apps
Webinar - Creating No Code Web Apps
with FME Flow
Conclusion
By choosing the right
Transformer, you can merge
and join data more
efficiently.
Summary
Data used in the demos are from the City of Surrey Open Data Site
● FeatureJoiner or InlineQuerier for disparate datasets
● SQLExecutor or SQLCreator if working with databases
● SQL options for SQL experts
● DatabaseJoiner for lookup tables
● Use what you’re comfortable with!
Resources
Resources
● [Knowledge Base] Merging or Joining Spreadsheet or Database Data
● FeatureJoiner Documentation: Choosing a Feature Joining Method section
● [Article] Using the InlineQuerier as a replacement for multiple FeatureMergers
● [Webinar] Strategies for Detecting and Handling Schema Drift
● More at support.safe.com!
Get our Ebook
Spatial Data for the
Enterprise
fme.ly/gzc
Guided learning
experiences at your
fingertips
academy.safe.com
FME Academy
Resources
Check out how-to’s &
demos in the knowledge
base
support.safe.com
Knowledge Base Webinars
Upcoming &
on-demand webinars
safe.com/webinars
Check out
our podcasts
on-demand.
featuring special guest
speakers over at EM360
Resources
Next Steps
We’d love to help you get
started.
Get in touch with us at
info@safe.com
Experience the
FME Accelerator
Contact Us
A world where data is not just a
commodity but a catalyst for
real change.
fme.safe.com/accelerator
Next Steps
ClaimYour Community Badge &
Dive into the new Community!
● Get community badges for watching
webinars
● community.safe.com
● Today’s code: 73PGGG
Join the Community today!
Next Steps
Q&A
ThankYou
Recap of Next Steps
1 Join the FME Community
2 Contact us
3 Experience the FME Accelerator
Please fill out our
webinar survey

More Related Content

Data Integration Basics: Merging & Joining Data

  • 2. Jovita Chan Technical Support Specialist Safe Software Inc. Sara Mak Technical Support Specialist Safe Software Inc.
  • 3. Welcome to Livestorm. A few ways to engage with us during the webinar: Audio issues? Click this for 4 simple troubleshooting steps.
  • 4. Data silos are often a result of multiple data formats. Just because you have different kinds of data, doesn’t mean they can’t be brought together.
  • 5. Poll: What are your biggest challenge when it comes to merging and joining data?
  • 6. Merging and joining disparate datasets can be streamlined! FME transformers can do the work for you.
  • 7. But there are many paths you can take in FME.
  • 8. Agenda 1 Introduction to Merging & Joining FeatureJoiner & FeatureMerger DatabaseJoiner 2 Merging Attributes of Multiple Datasets InlineQuerier SQLExecutor & SQLCreator 3 Resources & Next Steps 4 Q&A Agenda
  • 10. One platform, two technologies FME Form FME Flow Data Movement and transformations (“ETL”) workflows are built here. Brings life to FME Form workflows FME Flow Hosted Safe Software managed FME Flow fme.safe.com/platform FME Enterprise Integration Platform Safe & FME
  • 11. Number of supported data types in FME 1995 2000 2005 2010 2015 2020 2023… 10 100 300 500 GIS CAD Database XML Raster 3D BIM Web Point Cloud Cloud Big Data IOT Gaming BI Indoor Mapping AR/VR Generative AI Cloud Native Tabular Merge and Join ANY Type of Data
  • 12. BIM and Facilities Management Integration IBM Shared Parameters: ● IBM.Name ● IBM.Description ● ... Tririga BIM Integrator ArcGIS Indoors
  • 14. Merging or Joining Data in FME When data is merged, the attributes from each dataset are combined together based on a common identifier and then output as a single fused dataset. This is different from appending data.
  • 15. Terminology Append/Union Merge/Join Easy Peasy in FME! Todayʼs Focus Combine Vertically Combine Horizontally Append Merge
  • 16. Terminology Append/Union Merge/Join Easy Peasy in FME! Todayʼs Focus Combine Vertically Combine Horizontally Append Merge
  • 17. Selecting a Transformer Here are some of our favourites joining data…
  • 18. The Two Categories: SQL and SQL-Free SQL Transformers for Joining & Merging Require knowledge of SQL. Includes: ● InlineQuerier ● SQLCreator ● SQLExecutor SQL-Free Transformers for Joining & Merging Same functionality of SQL transformers. Can be set up easily without any database knowledge. Includes: ● FeatureJoiner ● FeatureMerger ● FeatureReader ● DatabaseJoiner
  • 19. Follow the Flow Chart 1) Is all of my data already inside the workspace? ● If yes, follow the INTERNAL path ● If no, follow the EXTERNAL path Inspect your data and ask yourself: FME Community: Merging or Joining Spreadsheet or Database Data
  • 20. Follow the Flow Chart 1) Is all of my data already inside the workspace? ● If yes, follow the INTERNAL path ● If no, follow the EXTERNAL path Inspect your data and ask yourself: 2) Does the FeatureJoiner for internal or DatabaseJoiner for external work for my data? ● If yes, Great! End here. ● If no, continue to question 3 FME Community: Merging or Joining Spreadsheet or Database Data
  • 21. Follow the Flow Chart 1) Is all of my data already inside the workspace? ● If yes, follow the INTERNAL path ● If no, follow the EXTERNAL path Inspect your data and ask yourself: 2) Does the FeatureJoiner for internal or DatabaseJoiner for external work for my data? ● If yes, Great! End here. ● If no, continue to question 3 3) Do I know or want to use SQL? ● If yes, see the transformers in the green SQL box ● If no, see the transformers in the blue box FME Community: Merging or Joining Spreadsheet or Database Data
  • 24. FeatureMerger Results in a single match by default For 1:M, generate a list of matches for each Requestor by enabling: ● Process Duplicate Suppliers ● Generate List Tutorial: Working with List Attributes Tutorial Documentation
  • 25. FeatureJoiner ● Simpler, and performs faster than FeatureMerger ● SQL-like terminology ● Handles cardinalities easier than the FeatureMerger (ie. 1:M, M:N and M:1) ● Supports multiple matches per source (by default, 1:M results in a record for each match) FeatureMerger ● Handling 1:M joins is a bit more work ● Default: Single match is output for 1:M relationship ● Many Suppliers can be merged on to a single Requestor. ● Generate lists of matches ● Can create geometries!
  • 27. Midstream Join with DatabaseJoiner Provides the ability to form a join against a database (incl. CSV & Excel) with an existing dataset in the workspace. Read in database formats, Excel or CSV Matching and accumulation options Let the database do the work by pre-filtering a subset of the table you’re reading into the workspace Tutorial Documentation
  • 28. DatabaseJoiner ● Non-Blocking - Joins data as each row flows by ● Connect to and join to database tables midstream with one easy transformer! ● Can also connect to Excel and CSV ● Select which attributes you want to join ● Explicit cardinality parameters ● Good for quick lookups - caches first 5000 records FeatureMerger/Joiner ● Blocking - Wait until all data has been read before doing processing ● All FME readers can be used ○ No need for anything to be in a database ● Cardinality options vary
  • 29. DatabaseJoiner ● Non-Blocking - Joins data as each row flows by ● Data is external to the workspace and can be changed without editing the workspace ● Can merge in multiple attributes ● Explicit cardinality parameters ● Best if the join data is larger or changing AttributeValueMapper ● Non-Blocking - Joins data as each row flows by ● No external dependencies ○ Data to lookup is stored in the workspace ○ Can be imported at design time from an external source ● Only a single attribute can be added ● 1:1 lookups only ● Very fast ● Best for small, simple, stable mappings
  • 30. Midstream Join with FeatureReader ● Simpler midstream option ● Join against any database or spatial format ● Typically used for spatial joins (more about spatial joins in Part II) ● Also used to perform tabular joins with database tables
  • 31. Midstream Join with FeatureReader 1. Set the WHERE Clause to: “<Incoming Attribute>” = ʻ@Value(<Initiator Attribute>)ʼ 2. Set Accumulation Mode to Merge Initiator and Result Documentation
  • 32. DatabaseJoiner ● Connect to and join to database tables midstream with one easy transformer! ● Can also connect to Excel and CSV ● Select which attributes you want to join ● Explicit cardinality parameters ● Very efficient, especially when using a primed cache ○ Automatically done for File-based joins FeatureReader ● Can be done midstream which aids in adding and joining data quickly ● Limited options: Inner Joins only using WHERE clause ● Does lots of work per input feature ○ Only efficient when 1 input results in large numbers of output
  • 34. InlineQuerier ● Data already loaded, any data source can be used ● Uses SQL ● Join data in a temporary SQLite database ● Multiple queries in a single transformer ● Great if you 🩷 SQL FeatureJoiner ● Data already loaded, any data source can be used ● Uses SQL-Free joins ● Easy, approachable ● Great performance
  • 35. InlineQuerier Move over FeatureJoiner: ● Use any data source ● Utilizes the power of SQL ● Multiple queries ● Join data in a temporary SQLite database ● Includes spatial SQLite Tutorial - An Easy Way to Master SQLite Fast
  • 37. Tips for InlineQuerier ● Loads a temporary SQLite database - so consolidate all your queries if you can ○ Takes some time to load ○ Fast once it’s loaded ● Don’t use it for simple tasks, i.e. Filtering ● Use smart configuration to improve performance ○ Only define input columns needed in the queries Jovita
  • 38. InlineQuerier ● Any data source can be used ● Utilizes the power of SQL ● Join data in a temporary SQLite database ● Multiple queries SQLExecutor ● Used to execute SQL against a database ● Requires an incoming feature to trigger the SQL statement SQLCreator ● Used to execute SQL against a database ● No incoming feature required
  • 39. SQLExecutor & SQLCreator Move over InlineQuerier ● Needs a database ● Utilizes the power of SQL ● Includes spatial ● You can use any SQL - ○ Select data ○ Create indices ○ Drop tables You don’t have so just read data LetYour Database Do The Work
  • 41. Tips for SQLExecutor or SQLCreator ● Let your database do the work ○ You only load the data you need ● Needs all of your data is in a single database ● SQL - any level of SQL complexity ● Expose attributes if you need them in the workbench
  • 43. Automating Automate any of your workflows with apps on FME Flow. Resources: Getting Started with FME Flow Getting Started with FME Flow Workspace Apps Getting Started with Gallery Apps Webinar - Creating No Code Web Apps with FME Flow
  • 45. By choosing the right Transformer, you can merge and join data more efficiently.
  • 46. Summary Data used in the demos are from the City of Surrey Open Data Site ● FeatureJoiner or InlineQuerier for disparate datasets ● SQLExecutor or SQLCreator if working with databases ● SQL options for SQL experts ● DatabaseJoiner for lookup tables ● Use what you’re comfortable with!
  • 48. Resources ● [Knowledge Base] Merging or Joining Spreadsheet or Database Data ● FeatureJoiner Documentation: Choosing a Feature Joining Method section ● [Article] Using the InlineQuerier as a replacement for multiple FeatureMergers ● [Webinar] Strategies for Detecting and Handling Schema Drift ● More at support.safe.com!
  • 49. Get our Ebook Spatial Data for the Enterprise fme.ly/gzc Guided learning experiences at your fingertips academy.safe.com FME Academy Resources Check out how-to’s & demos in the knowledge base support.safe.com Knowledge Base Webinars Upcoming & on-demand webinars safe.com/webinars
  • 50. Check out our podcasts on-demand. featuring special guest speakers over at EM360 Resources
  • 52. We’d love to help you get started. Get in touch with us at info@safe.com Experience the FME Accelerator Contact Us A world where data is not just a commodity but a catalyst for real change. fme.safe.com/accelerator Next Steps
  • 53. ClaimYour Community Badge & Dive into the new Community! ● Get community badges for watching webinars ● community.safe.com ● Today’s code: 73PGGG Join the Community today! Next Steps
  • 54. Q&A
  • 55. ThankYou Recap of Next Steps 1 Join the FME Community 2 Contact us 3 Experience the FME Accelerator Please fill out our webinar survey