ogrisel Olivier Grisel
@atpassos_ml @mathieuen you can use a moving window for text tokens (cross product only inside 100 consecutive words). Mahout can do this.
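A rough sketch of the moving-window idea, for readers following along: pair each token only with the tokens that come shortly after it, so the number of cross-products grows with the window size rather than with the square of the document length. The function name `windowed_pairs` and its `window` parameter are made up for illustration; this is not Mahout's API.

```python
def windowed_pairs(tokens, window=100):
    """Yield (earlier, later) token pairs that fit inside `window` consecutive words."""
    for i, left in enumerate(tokens):
        # Only look ahead window - 1 positions, so both tokens of a pair
        # always lie within the same run of `window` consecutive words.
        for right in tokens[i + 1 : i + window]:
            yield (left, right)


if __name__ == "__main__":
    words = "the quick brown fox".split()
    for pair in windowed_pairs(words, window=2):
        print(pair)
```

With `window=2` only adjacent words are paired; with a window at least as long as the document, every ordered pair comes out.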
atpassos_ml Alexandre Passos
@ogrisel @mathieuen I don't think statistical tests are a good idea; we should be able to do this as fast as lexing and trust regularization
atpassos_ml Alexandre Passos
@ogrisel @mathieuen just, when adding a new feature, go through every nonzero feature in the vector and continue the hash from there
atpassos_ml Alexandre Passos
@ogrisel this should have lots of collisions (and can't handle redundant hashes as-is), but it should be a fast way to get n-th order cross-products
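One possible reading of "continue the hash from there", sketched under assumptions: keep the running hash state of every feature already added, and when a new token arrives, feed its bytes into each of those states to get the hashed cross-product. The sketch uses `zlib.crc32`, whose second argument is a running checksum, as a stand-in hash. The name `hashed_cross_features` and the `n_buckets`, `max_order` parameters and the `\x00` separator are illustrative choices, not something stated in the thread.

```python
import zlib
from collections import defaultdict


def hashed_cross_features(tokens, n_buckets=2**18, max_order=2):
    """Hash tokens and their cross-products (up to max_order) into n_buckets."""
    counts = defaultdict(int)  # bucket index -> count (the sparse vector)
    states = []                # (running CRC, order) for every feature added so far
    for token in tokens:
        data = token.encode("utf-8")
        # The token on its own is a first-order feature.
        new_states = [(zlib.crc32(data), 1)]
        # Continue the hash of each existing feature with the new token.
        # The separator byte keeps "ab"+"c" distinct from "a"+"bc".
        for state, order in states:
            if order < max_order:
                crossed = zlib.crc32(b"\x00" + data, state)
                new_states.append((crossed, order + 1))
        for state, _ in new_states:
            counts[state % n_buckets] += 1  # collisions simply add up
        states.extend(new_states)
    return dict(counts)


if __name__ == "__main__":
    print(hashed_cross_features(["cheap", "pills", "online"]))
```

No feature names are ever stored, only hash states, so the cost per cross-product is one incremental hash update. The list of states still grows with the document, and distinct features can land in the same bucket, which matches the caveat about collisions.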
mikentweets Michael Nute
@mathieuen how are you using the cross product on features? And what's the hash trick? So many questions and so few characters... #stats