Bag of Tricks in Machine Learning

Public Personal Notebook

Month: July 2022

Embedding-based Search Retrieval Papers and Blogs

July 17, 2022 (updated October 12, 2022) by admin
  • Alibaba: https://arxiv.org/abs/2106.09297 
  • Alibaba: https://arxiv.org/abs/2210.04170
  • Amazon: https://arxiv.org/abs/1907.00937
  • Amazon: https://arxiv.org/abs/2105.02978
  • Coveo: https://arxiv.org/abs/2104.02061 
  • eBay: https://dl.acm.org/doi/abs/10.1145/3366424.3382715 
  • Facebook: https://arxiv.org/abs/2006.11632
  • Facebook: https://research.facebook.com/publications/que2search-fast-and-accurate-query-and-document-understanding-for-search-at-facebook/ 
  • Google: https://arxiv.org/abs/2010.01195 
  • HomeDepot: https://arxiv.org/abs/2008.08180 
  • Instacart: https://sigir-ecom.github.io/ecom22Papers/paper_8392.pdf
  • JD: https://arxiv.org/abs/2006.02282
  • Pinterest: https://medium.com/pinterest-engineering/searchsage-learning-search-query-representations-at-pinterest-654f2bb887fc
  • Spotify: https://engineering.atspotify.com/2022/03/introducing-natural-language-search-for-podcast-episodes
  • Walmart: https://dl.acm.org/doi/abs/10.1145/3308560.3316603 
  • Walmart: https://dl.acm.org/doi/abs/10.1145/3534678.3539164
  • Wayfair: https://arxiv.org/abs/2204.05231
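The papers above share one core idea: encode queries and documents into a shared dense vector space, then retrieve by nearest-neighbor search. A minimal sketch of that retrieval step, using a toy character-trigram hashing encoder as a stand-in for a learned two-tower model (the encoder here is purely illustrative, not any specific paper's method):

```python
import numpy as np

DIM = 64  # embedding dimension of the toy encoder

def encode(text: str) -> np.ndarray:
    """Toy encoder: hash character trigrams into a DIM-dim vector.
    A real system would use a learned query/document tower instead."""
    v = np.zeros(DIM)
    t = text.lower()
    for i in range(len(t) - 2):
        v[hash(t[i:i + 3]) % DIM] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v  # L2-normalize: dot product = cosine

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents by cosine similarity to the query."""
    doc_matrix = np.stack([encode(d) for d in corpus])  # (N, DIM) index
    scores = doc_matrix @ encode(query)                 # cosine scores
    top = np.argsort(-scores)[:k]
    return [corpus[i] for i in top]

corpus = [
    "red running shoes for men",
    "wireless noise cancelling headphones",
    "running shoes women red",
]
print(retrieve("red shoes", corpus, k=1))
```

In production the brute-force `doc_matrix @ query` scan is replaced by an approximate nearest-neighbor index (e.g. Faiss or HNSW), which is where most of the engineering in these papers lives.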
Categories: nlp, papers

Best way to follow this website

  1. In Feedly, click “add content”
  2. Enter the URL “bagoftricks.ml” and click follow

Shujian (@Shujian_Liu)

Software engineer @Google Travel. PhD in renewable energy from @UMassAmherst. @kaggle triple master. IR/NLP/ASR.
Retweeted: Lilian Weng (@lilianweng), 19 Mar

🛠 New posts on Prompt Engineering: steer a large pretrained language model to do what you want w/o updating the model weights.

https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/

Most importantly, this only introduces general ideas; for your own problem, you always need tuning and experimentation.

Retweeted: François Chollet (@fchollet), 18 Mar

"It's autocomplete" is not a helpful analogy for understanding LLMs. An LLM is more like a database that lets you query information in natural language. You can query both knowledge and "patterns" (associative programs seen in the training data, which can be applied to new inputs).

Retweeted: Jason Wei (@_jasonwei), 13 Mar

Hot take supported by evidence: for a given NLP task, it is unwise to extrapolate performance to larger models because emergence can occur.

I manually examined all 202 tasks in BIG-Bench, and the most common category was for the scaling behavior to *unpredictably* increase.

Retweeted: Cosmin Negruseri (@cosminnegruseri), 28 Feb

This slide is great and focuses on one ranking model plus postprocessing, but if your team owns an end-to-end system with indexing, candidate retrieval, ranking, and blending, on-call is even more complicated.

Retweeted: Jo Kristian Bergum (@jobergum), 27 Feb

Hmm, a ready-to-ship e-commerce search solution with tunable hybrid ranking, autocomplete query suggestions, and query-contextualized navigation. Better than any commercial vendor, but built with open-source technology, so you can see how the sausage is made.


Categories

  • ml (2)
  • nlp (15)
  • papers (4)
  • random (8)
  • reinforcementlearning (1)
  • search (1)
  • Uncategorized (59)

Tags

anomaly (1) automl (1) ctr (1) cv (2) data (1) distributedtraining (1) gan (1) kaggle (1) list (2) ml (2) NER (1) nlp (10) nn (1) paper (1) random (3) reinforcementlearning (1) sql (1)

Recent Posts

  • Google’s Deep Learning Tuning Playbook
  • Few-Shot Learning in NLP
  • (Very) Large Language Models in 2022
  • Airbnb Search Papers
  • Dense Retriever for Salient Phrase

Archives

  • January 2023 (1)
  • October 2022 (1)
  • August 2022 (3)
  • July 2022 (1)
  • July 2021 (2)
  • June 2021 (1)
  • May 2020 (1)
  • October 2019 (1)
  • August 2019 (11)
  • July 2019 (8)
  • June 2019 (6)
  • May 2019 (1)
  • April 2019 (11)
  • March 2019 (4)
  • February 2019 (2)
  • January 2019 (32)


    © 2023 Bag of Tricks in Machine Learning • Built with GeneratePress