Change Log

All notable changes to PySS3 will be documented here.

[0.6.4] 2021-01-30

Fixed

Quick fix of default compatibility with foreign languages (#15).

[0.6.3] 2020-07-17

Fixed

Patches issue #11.

[0.6.1] 2020-05-26

Added

Dataset.load_from_files_multilabel() can load documents with no labels as well (31251f8).
A set_testset_from_files_multilabel() function was added to the Live_Test class. This function allows loading multilabel datasets from disk into the Live Test server (0ddbd6a).

Fixed

Fixed a bug in SS3 hyperparameter initialization (e2e72f9).

[0.6.0] 2020-05-24

Added

PySS3 now fully supports multi-label classification! :)

The load_from_files_multilabel() function was added to the Dataset class (7ece7ce, resolved #6)
The Evaluation class now supports multi-label classification (resolved #5)
- Add multi-label support to train()/fit() (4d00476)
- Add multi-label support to Evaluation.test() (0a897dd)
- Add multi-label support to show_best() and get_best() (ef2419b)
- Add multi-label support to kfold_cross_validation() (aacd3a0)
- Add multi-label support to grid_search() (925156d, 79f1e9d)
- Add multi-label support to the 3D Evaluation Plot (42bbc65)
The Live Test tool now supports multi-label classification as well (15657ee, b617bb7, resolved #9)
Category names are no longer case-insensitive (4ec009a, resolved #8)

[0.5.7] 2020-05-05

Added

The Live Test Tool now supports custom (user-defined) preprocessing methods (b50cfaf, 7c6b0c6, resolved #3).
The tokenization process was improved (26fff88, 4af8e80).
The process for recognizing word n-grams during classification was improved (2ceb148).

[0.5.5] 2020-03-02

Added

The predict method was optimized. Now it is 10x to 200x faster! This improvement also has a positive impact on other methods that use predict such as grid_search (37202d8).
A new get_ngrams_length method was added to the SS3 class. It can be used to get the length of the longest learned n-gram (b4f8827).
The Evaluation 3D Plot’s GUI was improved (1bb1e5a).

Fixed

Some bugs and errors were fixed (bc5c4ed, 0d3d7e1, 86a0189, b0b3eaa, 5dbdc3a).

[0.5.0] 2020-02-24

Added

A new Evaluation class to pyss3.util (8feeef5): Now the user can import the Evaluation class to perform model evaluation and hyperparameter optimization. This class not only provides methods to evaluate models but also keeps all the advantages previously provided only through the Command Line tool, such as an evaluation cache that automatically keeps track of the evaluation history and the generation of the interactive 3D evaluation plot.
set_name() to SS3 (5b1c355).
train() to SS3 as a user-friendly alias of fit() (74cb540).
Print now supports nested verbosity regions (78176ab).
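
The idea of nested verbosity regions can be pictured as a stack of levels, where leaving a region restores whatever level was active before it. The sketch below is only an illustration of that idea under stated assumptions: the names verbosity_region and current_verbosity are hypothetical and are not PySS3's Print API.

```python
from contextlib import contextmanager

# Stack of verbosity levels; the last entry is the active one.
# Hypothetical module-level state, used only for this illustration.
_levels = [1]


@contextmanager
def verbosity_region(level):
    """Temporarily switch to `level`, restoring the previous one on exit."""
    _levels.append(level)
    try:
        yield
    finally:
        _levels.pop()


def current_verbosity():
    """Return the verbosity level of the innermost active region."""
    return _levels[-1]


# Record the active level at each nesting depth.
seen = [current_verbosity()]
with verbosity_region(0):
    seen.append(current_verbosity())
    with verbosity_region(2):
        seen.append(current_verbosity())
    seen.append(current_verbosity())
seen.append(current_verbosity())
```

Using a stack means an inner region cannot accidentally leave the outer region at the wrong level, even if an exception is raised inside it.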

Fixed

Compatibility of progress bars with Jupyter Notebooks (7848b3e, 8d163d9, 2029c37, 2a700d5).
Bug in SS3.fit when given an empty document (31eccbc).
Non-string category labels support (5b1c355).
Issue with verbosity level consistency (b38d8b0).
IndexError in classify_(multi)label (fa91952).
Python 2 UnicodeEncodeError issue (867026e).

[0.4.1] 2020-02-16

Added

Public methods for the SS3’s cv, gv, lv, sg and sn functions have been added to the SS3 class (ef35b25). These functions were originally defined in Section 3.2.2 of the original paper.
Slightly improved training time (due to the previously disabled-by-default cache of the “local value” function).

Fixed

A bug in the HTTP Live Test Server (d106d68)
Some bugs in the Command-Line tool (cd42b61, 8745603, dfe8b95)

[0.4.0] 2020-02-11

Among other minor improvements and changes, the most important ones that were added are:

Added

SS3 class:
- The classifier now explicitly supports multi-label classification:
- Created the following two methods in SS3 class: classify_multilabel() and classify_label() (0759bca).
- A multilabel argument was added to the predict method (c5ac946).
- A new extract_insight() method was added to the SS3 class. This method, given a document, returns the pieces of text that were involved in the classification decision (eee1e29).
- Created four new methods to allow the user to set the delimiters (b632fe0): set_block_delimiters(), set_delimiter_paragraph(), set_delimiter_sentence(), and set_delimiter_word().
Live Test tool:
- Improved the interface by which the “Live Test” server is called from source code; its usage is now more user-friendly and less misleading (read 516b526 for more info).
- Improved the way by which multi-label classification was carried out in the Web interface (046f9f4).
Improved how PySS3 handles verbosity levels (read 216be41 for more info): created the set_verbosity() function.

[0.3.9] 2019-11-27

Added

Live Test: layout updated.
PySS3 Command Line: frange function added as an alias of r for the grid_search command.
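
As a rough illustration of what a float range offers a grid search, the sketch below generates evenly spaced candidate values for a hyperparameter. The name frange_sketch, its (start, stop, count) signature, and the inclusive endpoints are assumptions made for this example; they are not taken from the command-line tool's implementation.

```python
def frange_sketch(start, stop, count):
    """Return `count` evenly spaced floats from `start` to `stop`, inclusive.

    Hypothetical helper: the name, arguments, and endpoint handling are
    assumptions, not the actual behavior of PySS3's frange/r.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return [float(start)]
    step = (stop - start) / (count - 1)
    # Round to limit floating-point noise in the generated grid values.
    return [round(start + i * step, 10) for i in range(count)]


# Four candidate values between 0.2 and 0.8 for one hyperparameter.
values = frange_sketch(0.2, 0.8, 4)
```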

Fixed

PySS3 Command Line: live_test always launched the server with no documents (even when a path was given, as in “live_test a/path”)
Live Test: sentences starting with the “unknown” token were not included in the “Advanced” interactive chart

[0.3.8] 2019-11-25

Fixed

Server: fixed bug that stopped the server when receiving arbitrary bytes (not utf-8 strings)
PySS3 Command Line: fixed bug when loading live_test with a non-existing path
Live Test: now the user can select one-letter words (these are also included in the “advanced” live chart)
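
The server fix listed above concerns input that is not valid UTF-8. One common way to make a server tolerant of such input is to decode with a replacement policy instead of letting the decode raise. The sketch below shows that general technique only; decode_request is a hypothetical name, and this is not the code used in the PySS3 server.

```python
def decode_request(raw):
    """Decode received bytes as UTF-8 without raising on invalid sequences.

    Hypothetical helper illustrating defensive decoding in general.
    """
    # errors="replace" substitutes U+FFFD for bytes that are not valid UTF-8.
    return raw.decode("utf-8", errors="replace")


# Valid UTF-8 ("café") followed by a byte that can never appear in UTF-8.
text = decode_request(b"caf\xc3\xa9 \xff")
```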

[0.3.7] 2019-11-22

Added

Summary operators are no longer static.
Server.set_testset_from_files lazy load.
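
Lazy loading, in general, means that setting a test set only records where the data comes from, and the documents are read the first time they are needed. The sketch below illustrates that pattern with a hypothetical LazyTestSet class and an in-memory loader; it makes no claim about how Server.set_testset_from_files is implemented.

```python
class LazyTestSet:
    """Hypothetical holder that defers loading until documents are needed."""

    def __init__(self, loader):
        self._loader = loader    # callable that returns the documents
        self._documents = None   # nothing is loaded at construction time
        self.load_count = 0      # how many times the loader has run

    @property
    def documents(self):
        # Load on first access only; later accesses reuse the cached list.
        if self._documents is None:
            self._documents = self._loader()
            self.load_count += 1
        return self._documents


# The lambda stands in for an expensive read from disk.
lazy = LazyTestSet(lambda: ["first document", "second document"])
loads_before_access = lazy.load_count
docs = lazy.documents
docs_again = lazy.documents
```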

Fixed

Evaluation plot: confusion matrices size when working with k-folds

[0.3.6] 2019-11-14

Added

Dataset class added to pyss3.util as an interface to help the user to load/read datasets. Method Dataset.load_from_files added
Documentation updated

[0.3.5] 2019-11-12

Added

PySS3 Command Line Python 2 full compatibility support

Fixed

Matplotlib set_yaxis bug fixed

[0.3.4] 2019-11-12

Fixed

Dependencies and compatibility with Python 2 improved

[0.3.3] 2019-11-12

Fixed

Setup and tests fixed

[0.3.2] 2019-11-12

Added

Summary operators: now it is possible to use user-defined summary operators; the following static methods were added to the SS3 class: summary_op_ngrams, summary_op_sentences, and summary_op_paragraphs.
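
To give a feel for what a user-defined summary operator might look like, the sketch below assumes that an operator receives a list of confidence vectors (one per n-gram, sentence, or paragraph, with one value per category) and reduces them to a single vector. That input/output shape, the function names, and the sample values are all assumptions made for illustration.

```python
def sum_summary_op(cvs):
    """Add confidence vectors element-wise (one value per category)."""
    return [sum(column) for column in zip(*cvs)]


def max_summary_op(cvs):
    """Hypothetical alternative: keep the strongest value per category."""
    return [max(column) for column in zip(*cvs)]


# Made-up confidence vectors for three sentences over two categories.
sentence_cvs = [[0.5, 0.0], [0.25, 1.0], [0.25, 0.5]]
summed = sum_summary_op(sentence_cvs)
strongest = max_summary_op(sentence_cvs)
```

Swapping addition for a maximum changes how much a single highly confident sentence can dominate the result for a document, which is the kind of behavior a custom operator lets the user control.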

[0.3.1] 2019-11-11

Added

update: some docstrings were improved
update: the README.md / PyPI Description file.

Fixed

Python 2 and 3 compatibility problem with scikit-learn (using version 0.20.1 from now on)
PyPI: setup.py: long_description_content_type set to ‘text/markdown’