Simon Willison’s Weblog

Sunday, 27th June 2021

Group thousands of similar spreadsheet text cells in seconds (via) Luke Whyte explains how to efficiently group similar text columns in a table (Walmart and Wal-mart for example) using a clever combination of TF/IDF, sparse matrices and cosine similarity. Includes the clearest explanation of cosine similarity for text I’ve seen—and Luke wrote a Python library, textpack, that implements the described pattern.
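To make the combination concrete, here is a minimal standard-library sketch of the general idea: character n-gram TF-IDF vectors stored sparsely as dicts, compared by cosine similarity. It is not textpack's API or Luke Whyte's code. The function names, the trigram size, the smoothed IDF formula and the 0.5 threshold are all assumptions chosen for illustration, and the greedy pairwise loop stands in for the sparse matrix multiplication that makes the real approach fast.

```python
import math
from collections import Counter


def ngrams(text, n=3):
    """Lower-case, drop non-alphanumerics, and return character n-grams."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum())
    if len(cleaned) < n:
        return [cleaned] if cleaned else []
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]


def tfidf_vectors(values, n=3):
    """Return one L2-normalised sparse vector (a dict) per input string."""
    counts = [Counter(ngrams(v, n)) for v in values]
    # Document frequency: in how many strings does each n-gram appear?
    doc_freq = Counter(gram for c in counts for gram in c)
    total = len(values)
    vectors = []
    for c in counts:
        # Smoothed IDF (an assumed variant): rarer n-grams get more weight.
        vec = {
            g: tf * (math.log((1 + total) / (1 + doc_freq[g])) + 1)
            for g, tf in c.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({g: w / norm for g, w in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """For unit-length vectors the dot product is the cosine similarity."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b.get(g, 0.0) for g, w in a.items())


def group_similar(values, threshold=0.5, n=3):
    """Map each value to the first earlier value it is similar enough to."""
    vectors = tfidf_vectors(values, n)
    representatives = []  # indexes of values that started a group
    lookup = {}
    for i, value in enumerate(values):
        for rep in representatives:
            if cosine(vectors[i], vectors[rep]) >= threshold:
                lookup[value] = values[rep]
                break
        else:
            representatives.append(i)
            lookup[value] = value
    return lookup


if __name__ == "__main__":
    names = ["Walmart", "Wal-mart", "Target", "Wal Mart", "Costco"]
    for name, group in group_similar(names).items():
        print(f"{name!r} -> {group!r}")
```

Because every vector is normalised to unit length, the cosine similarity reduces to a dot product over the n-grams two strings share. That property is what lets the whole comparison be expressed as a single sparse matrix product in a production implementation, instead of the quadratic loop used here.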

# 4:24 pm / python, data-science

