[Predator] Use tf-idf method to convert crash reports to text vectors

Find a good way to represent a crash using the tf-idf method.
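For readers unfamiliar with tf-idf, here is a minimal sketch of how crash report text could be turned into weighted keyword vectors. It is not Predator's implementation: the `tokenize` and `tfidf_vectors` helpers, the tokenization rule, and the plain `log(N / df)` weighting are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(crash_text):
    """Split crash text into lowercase identifier-like tokens (assumed rule)."""
    return [t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", crash_text)]


def tfidf_vectors(crash_texts):
    """Return one {keyword: tf-idf weight} dict per crash text.

    tf is the keyword's share of tokens in that crash; idf is
    log(N / df), where df is the number of crashes containing the keyword.
    """
    docs = [Counter(tokenize(text)) for text in crash_texts]
    n_docs = len(docs)

    # Document frequency: count each keyword once per crash.
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())

    vectors = []
    for counts in docs:
        total = sum(counts.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical stack-frame text, not taken from a real crash.
crashes = [
    "base::MessageLoop::Run content::RenderFrame",
    "base::MessageLoop::Run blink::Node",
]
vectors = tfidf_vectors(crashes)
```

Keywords that appear in every crash (here `base`, `messageloop`, `run`) get weight 0, while keywords unique to one crash keep a positive weight, which is the property that makes the representation useful for telling crashes apart.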
The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/f75350b875f92157d0c1fed4715572ae3b88c723

commit f75350b875f92157d0c1fed4715572ae3b88c723
Author: Sharu Jiang <katesonia@google.com>
Date: Fri May 26 18:51:42 2017

[Predator] Add KeywordExtractor to extract keywords from CrashReport.

TBR=stgao@chromium.org
Bug: 722388
Change-Id: I336f906e5980272c66e9162934089a77a7a6cdeb
Reviewed-on: https://chromium-review.googlesource.com/505862
Reviewed-by: Sharu Jiang <katesonia@chromium.org>
Commit-Queue: Sharu Jiang <katesonia@chromium.org>

[add] https://crrev.com/f75350b875f92157d0c1fed4715572ae3b88c723/appengine/predator/analysis/keyword_extractor.py
[add] https://crrev.com/f75350b875f92157d0c1fed4715572ae3b88c723/appengine/predator/analysis/test/keyword_extractor_test.py
The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/cdd349375f630835b9dd2fd27497278925167bbd

commit cdd349375f630835b9dd2fd27497278925167bbd
Author: Sharu Jiang <katesonia@google.com>
Date: Tue May 30 17:34:20 2017

[Predator] Add ``InvertedIndex`` model to compute IDF of keywords of a crash.

Doc about IDF feature:
https://docs.google.com/a/google.com/document/d/1HFOQboYFPNB0VVduflrI6LJC01ordpCUVjkU4Z6aQjQ/edit?usp=sharing

TBR=stgao@chromium.org
Bug: 722388
Change-Id: Ie38508fc64478c60189e2e0d4713891819f2f2f9
Reviewed-on: https://chromium-review.googlesource.com/505994
Reviewed-by: Sharu Jiang <katesonia@chromium.org>
Commit-Queue: Sharu Jiang <katesonia@chromium.org>

[add] https://crrev.com/cdd349375f630835b9dd2fd27497278925167bbd/appengine/predator/common/model/test/inverted_index_test.py
[add] https://crrev.com/cdd349375f630835b9dd2fd27497278925167bbd/appengine/predator/common/model/inverted_index.py
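
As a rough illustration of how an inverted index can back an IDF computation, here is a small in-memory sketch. It does not reproduce the committed `inverted_index.py`: the `InMemoryInvertedIndex` class, its method names, and the smoothed `log((1 + N) / (1 + df))` formula are all hypothetical choices for this example.

```python
import math
from collections import defaultdict


class InMemoryInvertedIndex:
    """Hypothetical stand-in that tracks how many crashes contain each keyword."""

    def __init__(self):
        self.n_crashes = 0
        self.doc_counts = defaultdict(int)

    def add_crash(self, keywords):
        """Record one crash; each distinct keyword is counted once."""
        self.n_crashes += 1
        for keyword in set(keywords):
            self.doc_counts[keyword] += 1

    def idf(self, keyword):
        """Smoothed IDF (assumed formula): log((1 + N) / (1 + df))."""
        doc_count = self.doc_counts.get(keyword, 0)
        return math.log((1 + self.n_crashes) / (1 + doc_count))


index = InMemoryInvertedIndex()
# Hypothetical keywords, e.g. file paths or function names seen in crashes.
index.add_crash(["base/message_loop.cc", "content/renderer/render_frame.cc"])
index.add_crash(["base/message_loop.cc", "third_party/blink/node.cc"])

common_idf = index.idf("base/message_loop.cc")
rare_idf = index.idf("third_party/blink/node.cc")
unseen_idf = index.idf("never/seen.cc")
```

With this smoothing, a keyword present in every crash scores 0, and rarer or unseen keywords score progressively higher without ever dividing by zero.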

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/aa9db06bc880ca586b7db2742bf56f5d1bb8a8d5

commit aa9db06bc880ca586b7db2742bf56f5d1bb8a8d5
Author: Sharu Jiang <katesonia@google.com>
Date: Wed Jul 12 23:50:17 2017

[Predator] Add FilePathIdfFeature.

Design doc:
https://docs.google.com/a/google.com/document/d/1HFOQboYFPNB0VVduflrI6LJC01ordpCUVjkU4Z6aQjQ/edit?usp=sharing

Bug: 722388
Change-Id: Id28a374a2c5f48e6e4853468d5d61536b255a4ae
Reviewed-on: https://chromium-review.googlesource.com/517462
Commit-Queue: Sharu Jiang <katesonia@chromium.org>
Reviewed-by: Jeffrey Li <lijeffrey@chromium.org>

[add] https://crrev.com/aa9db06bc880ca586b7db2742bf56f5d1bb8a8d5/appengine/predator/analysis/linear/changelist_features/file_path_idf.py
[add] https://crrev.com/aa9db06bc880ca586b7db2742bf56f5d1bb8a8d5/appengine/predator/analysis/linear/changelist_features/test/file_path_idf_test.py
[modify] https://crrev.com/aa9db06bc880ca586b7db2742bf56f5d1bb8a8d5/appengine/predator/common/predator_for_chromecrash.py
[modify] https://crrev.com/aa9db06bc880ca586b7db2742bf56f5d1bb8a8d5/appengine/predator/analysis/linear/model.py

Comment 1 by kateso...@chromium.org, May 19 2017