Daniel Tunkelang

Mountain View, California, United States

38K followers 500+ connections

Self

Massachusetts Institute of Technology

Contact Daniel for services

IT Consulting

About

Search, recommender systems, machine learning / AI, LLMs / RAG, data science. LinkedIn /…

Articles by Daniel

Where Do LTR Labels Come From?

By Daniel Tunkelang

Aug 6, 2024

How to be a Search Consultant

By Daniel Tunkelang

Aug 2, 2024

Search and the Art of Conversation

By Daniel Tunkelang

Jul 29, 2024


Activity

Honored to participate in Nicolay Christopher Gerold's star-studded podcast series on How AI Is Built. Nicolay and I had a great conversation about…

Shared by Daniel Tunkelang

We just started posting our first Call for Participation for Search Solutions 2024. So if you have not received it yet via your favourite mailing…

Liked by Daniel Tunkelang

Talking about Learning To Rank (LTR) models, Daniel Tunkelang just wrote a great post about relevance judgements in Search. "The cost of training a…

Liked by Daniel Tunkelang

Experience & Education

Self

*******

**********
*****

*******
************* ********* ** **********

**, ** ******** *******, ***********

1988 - 1992
******** ****** **********

*** ******** *******

1993 - 1998


Publications

Semantic Equivalence of e-Commerce Queries

KDD 2023 Workshop on e-Commerce and NLP (ECNLP) August 7, 2023
Search query variation poses a challenge in e-commerce search, as equivalent search intents can be expressed through different queries with surface-level differences. This paper introduces a framework to recognize and leverage query equivalence to enhance searcher and business outcomes. The proposed approach addresses three key problems: mapping queries to vector representations of search intent, identifying nearest neighbor queries expressing equivalent or similar intent, and optimizing for user or business objectives. The framework utilizes both surface similarity and behavioral similarity to determine query equivalence. Surface similarity involves canonicalizing queries based on word inflection, word order, compounding, and noise words. Behavioral similarity leverages historical search behavior to generate vector representations of query intent. An offline process is used to train a sentence similarity model, while an online nearest neighbor approach supports processing of unseen queries. Experimental evaluations demonstrate the effectiveness of the proposed approach, outperforming popular sentence transformer models and achieving a Pearson correlation of 0.85 for query similarity. The results highlight the potential of leveraging historical behavior data and training models to recognize and utilize query equivalence in e-commerce search, leading to improved user experiences and business outcomes. Further advancements and benchmark datasets are encouraged to facilitate the development of solutions for this critical problem in the e-commerce domain.

ORDSIM: Ordinal Regression for E-Commerce Query Similarity Prediction

Proceedings of the International Workshop on Interactive and Scalable Information Retrieval methods for eCommerce (ISIR-eCom) 2022 March 13, 2022
Query similarity prediction is generally solved by regression-based models with square loss. Such a model is agnostic of absolute similarity values and penalizes regression error at the same scale across all ranges of similarity values. However, to boost an e-commerce platform's monetization, it is important to predict high-level similarity more accurately than low-level similarity, as highly similar queries retrieve items according to user intent, whereas moderately similar queries retrieve related items, which may not lead to a purchase. Regression models fail to customize their loss function to concentrate on the high-similarity band, resulting in poor performance on the query similarity prediction task. We address this challenge by treating query similarity prediction as an ordinal regression problem, and propose a model, ORDSIM (ORDinal Regression for SIMilarity Prediction). ORDSIM exploits variable-width buckets to model ordinal loss, which penalizes errors in high-level similarity harshly, and thus enables the regression model to obtain better prediction results for high similarity values. We evaluate ORDSIM on a dataset of over 10 million e-commerce queries from the eBay platform and show that ORDSIM achieves substantially smaller prediction error than the competing regression methods on this dataset.

MMM, Search!

Presented to Wikimedia Foundation April 27, 2020

An opinionated talk about search metrics, models, and methods. Presented to the Wikimedia Foundation on April 27, 2020 at the invitation of Wikimedia CTO Grant Ingersoll.

10 Tips to Optimize Holiday E-Commerce Sales

Total Retail December 10, 2017

Voice Search must Identify User Intent before it can make Sales

VentureBeat September 29, 2017

Ten Things Everyone Should Know About Machine Learning

Forbes September 5, 2017

Query Understanding: A Manifesto

InfoQ January 14, 2017

Doing Data Science Right - Your Most Common Questions Answered

First Round Review April 5, 2016
In this article, we've summarized the advice we give to founders who are interested in building data science teams. We explain why data science is so important for many startups, when companies should begin investing in it, where to put data science in their organization and how to build a culture where data science thrives.

Where should you put your data scientists?

O'Reilly Media January 7, 2016

Data Scientists: Generalists or specialists?

O'Reilly Media December 27, 2015

Beyond the Venn diagram

O'Reilly Media December 3, 2015

Beyond algorithms: Optimizing the search experience

O'Reilly Media October 13, 2015

Data Scientists at Work

Apress 2014
Interview published as a book chapter.

The Role of Network Distance in LinkedIn People Search

37th Annual International ACM SIGIR Conference (SIGIR 2014) 2014
Accepted as short research paper (peer-reviewed).

Web Science: How is it different?

ACM Web Science 2014 Conference (WebSci 2014) 2014

Keynote Address at ACM Web Science 2014 Conference (WebSci 2014)

Symposium on Human–Computer Information Retrieval

Journal of Big Data Mar 2013
Meeting report.

Find and be Found: Information Retrieval at LinkedIn

36th Annual International ACM SIGIR Conference (SIGIR 2013) 2013
Invited talk at the 36th Annual International ACM SIGIR Conference (SIGIR 2013)

Introduction to Special Issue on Human-Computer Information Retrieval

Journal of Information Processing & Management 2013
Editor's introduction for special issue on HCIR.

Content, Connections, and Context

6th ACM International Conference on Recommender Systems (RecSys 2012) 2012

Keynote at Workshop on Recommender Systems and the Social Web (RSWeb 2012)

Data By The People, For The People

21st ACM International Conference on Information and Knowledge Management (CIKM 2012) 2012

Invited Talk at the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012)

HCIR 2011: the Fifth International Workshop on Human-Computer Interaction and Information Retrieval

SIGIR Forum 45(2) 2011
Workshop report.

Recommendations as a Conversation with the User

5th ACM International Conference on Recommender Systems (RecSys 2011) 2011

Tutorial exploring role of recommendations as part of a conversation between user and information-seeking system.

Social Navigation: A Position Paper

HCIR 2011

In this position paper, we propose social navigation as a paradigm
for information access. We define social navigation as navigation
through explicit manipulation of a social lens and offer examples
of its application.

HCIR 2010: the Fourth International Workshop on Human-Computer Interaction and Information Retrieval

SIGIR Forum 44(2) 2010
Workshop report.

Design for Interaction

ACM SIGMOD/PODS Conference 2009

Special invited session on Human-Computer Interaction with Information.

Faceted Search

Morgan & Claypool 2009

In Synthesis Lectures on Information Concepts, Retrieval, and Services, edited by Gary Marchionini of the University of North Carolina.

HCIR 2009: the Third International Workshop on Human-Computer Interaction and Information Retrieval

SIGIR Forum 43(2) 2009
Workshop report.

Resolving the Battle Royale between Information Retrieval and Information Science

Information Seeking Support Systems Workshop 2008

Position paper at invitational workshop sponsored by the National Science Foundation (NSF).

Enterprise Information Access and the User Experience

IT Professional 9(1) 2007
Explanation of an enterprise information access framework. IT Professional is published by the IEEE.

Information Access and the User Experience

30th Annual International ACM SIGIR Conference (SIGIR 2007) 2007

Invited presentation at Industry Event.

Dynamic Category Sets: An Approach for Faceted Search

ACM SIGIR Workshop on Faceted Search 2006

Novel approach that addresses the vocabulary problem for faceted data.

Processing Search Queries in a Distributed Environment

ACM Thirteenth Conference on Information and Knowledge Management (CIKM 2004) 2004
Presented as a poster.

Making the Nearest Neighbor Meaningful

Workshop on Clustering High Dimensional Data and its Applications at Second SIAM International Conference on Data Mining (SDM 2002) 2002

Proposes new data-driven difference measure for categorical data.

JIGGLE: Java Interactive Graph Layout Environment

6th Annual Symposium on Graph Drawing (GD '98) 1998

Java-based platform for experimenting with numerical optimization approaches to general graph layout.

Lexical Navigation: Using Incremental Graph Drawing for Query Refinement

5th Annual Symposium on Graph Drawing (GD '97) 1997
Work done at IBM T. J. Watson Research Center.


Patents

Query-by-Example for Finding Similar People

Issued June 30, 2020 US 10,698,914

Guided Search

Issued June 23, 2020 US 10,691,760

Leveraging A Social Graph For Use With Electronic Messaging

Issued May 15, 2018 US 9,971,993

Method and System for Semantic Search against a Document Collection

Issued July 18, 2017 US 9,710,518

Query-by-Example for Finding Similar People

Filed March 30, 2016 US 15/085,516

Method and System for Semantic Search against a Document Collection

Issued August 25, 2015 US 9,116,948

System and Method for Measuring the Quality of Document Sets

Issued October 28, 2014 US 8,874,549

Interactive Construction of Queries

Filed September 29, 2014 US 14/500,633

Generating Suggested Structured Queries

Filed September 29, 2014 US 14/500,693

System and Method for Determining Users Working for the Same Employers in a Social Network

Issued September 9, 2014 US 8,831,969

Presenting Suggested Facets

Filed July 23, 2014 US 14/339,300

Method and System for Information Retrieval with Clustering

Issued March 18, 2014 US 8,676,802

System and Method for Concept Visualization

Issued September 3, 2013 US 8,527,515

Method and System for Semantic Search Against a Document Collection

Issued June 25, 2013 US 8,473,503

System and Method for Measuring the Quality of Document Sets

Issued July 10, 2012 US 8,219,593

System and Method for Measuring the Quality of Document Sets

Issued November 1, 2011 US 8,051,073

System and Method for Measuring the Quality of Document Sets

Issued November 1, 2011 US 8,051,084

System and Method for Measuring the Quality of Document Sets

Issued September 20, 2011 US 8,024,327

System and Method for Information Retrieval from Object Collections with Complex Interrelationships

Issued September 13, 2011 US 8,019,752

System and Method for Measuring the Quality of Document Sets

Issued August 23, 2011 US 8,005,643

Hierarchical Data-Driven Navigation System and Method for Information Retrieval

Issued March 22, 2011 US 7,912,823

Scalable Hierarchical Data-Driven Navigation System and Method for Information Retrieval

Issued November 10, 2009 US 7,617,184

Hierarchical Data-Driven Search and Navigation System and Method for Information Retrieval

Issued July 28, 2009 US 7,567,957

Integrated Application for Manipulating Content in a Hierarchical Data-Driven Search and Navigation System

Issued September 23, 2008 US 7,428,528

System and Method for Manipulating Content in a Hierarchical Data-Driven Search and Navigation System

Issued January 29, 2008 US 7,325,201

Hierarchical Data-Driven Search and Navigation System and Method for Information Retrieval

Issued June 13, 2006 US 7,062,483

Hierarchical Data-Driven Navigation System and Method for Information Retrieval

Issued April 25, 2006 US 7,035,864

Applying Numerical Approximation to General Graph Drawing

Issued November 30, 1999 US 5,995,114

Multimedia Document using Time Box Diagrams

Issued February 10, 1998 US 5,717,438

Item Retrieval Using Query Core Intent Detection

Filed June 14, 2022 US 17/840,337

Search Result Identification Using Vector Aggregation

Filed December 31, 2021 US 17/646,695

Guided Search

Filed December 23, 2014 US 14/582,065

Translating a Keyword Search into a Structured Query

Filed September 29, 2014 US 14/500,545

Identifying Members of a Small and Medium Business Segment

Filed June 27, 2014 US 14/318,326

Techniques For Identifying And Presenting Connection Paths

Filed July 13, 2012 US 13/548,957

Techniques For Identifying And Presenting Connection Paths

Filed May 19, 2012 US 13/482,884

Languages

Spanish
French
Italian
Portuguese
German

Recommendations received

9 people have recommended Daniel


More activity by Daniel

The most common goal that my search clients express is a desire to improve their ranking. This post sketches out some challenges of obtaining labeled…

Shared by Daniel Tunkelang

Master at great level.

Liked by Daniel Tunkelang
