Metadata-Version: 2.1
Name: liblinear-official
Version: 2.43.0
Summary: Python binding of LIBLINEAR
Home-page: https://www.csie.ntu.edu.tw/~cjlin/liblinear
Author: ML group @ National Taiwan University
Author-email: cjlin@csie.ntu.edu.tw
License: UNKNOWN
Description: -------------------------------------
        --- Python interface of LIBLINEAR ---
        -------------------------------------
        
        Table of Contents
        =================
        
        - Introduction
        - Installation via PyPI
        - Installation via Sources
        - Quick Start
        - Quick Start with Scipy
        - Design Description
        - Data Structures
        - Utility Functions
        - Additional Information
        
        Introduction
        ============
        
        Python (http://www.python.org/) is a programming language suitable for rapid
        development. This tool provides a simple Python interface to LIBLINEAR, a library
        for support vector machines (http://www.csie.ntu.edu.tw/~cjlin/liblinear). The
        interface is very easy to use as the usage is the same as that of LIBLINEAR. The
        interface is developed with the built-in Python library "ctypes."
        
        Installation via PyPI
        =====================
        
        To install the interface from PyPI, execute the following command:
        
        > pip install -U liblinear-official
        
        Installation via Sources
        ========================
        
        Alternatively, you may install the interface from sources by
        generating the LIBLINEAR shared library.
        
        Depending on your use cases, you can choose between local-directory
        and system-wide installation.
        
        - Local-directory installation:
        
            On Unix systems, type
        
            > make
        
            This generates a .so file in the LIBLINEAR main directory and you
            can run the interface in the current python directory.
            
            For Windows, the shared library liblinear.dll is ready in the
            directory `..\windows' and you can directly run the interface in
            the current python directory. You can copy liblinear.dll to the
            system directory (e.g., `C:\WINDOWS\system32\') to make it
            system-widely available. To regenerate liblinear.dll, please
            follow the instruction of building Windows binaries in LIBLINEAR
            README.
        
        - System-wide installation:
        
            Type
        
            > pip install -e .
        
            Please note that you must keep the sources after the installation.
        
            For Windows, to run the above command, Microsoft Visual C++ and
            other tools are needed.
        
            In addition, DON'T use the following FAILED commands
        
            > python setup.py install (failed to run at the python directory)
            > pip install .
        
        Quick Start
        ===========
        
        "Quick Start with Scipy" is in the next section.
        
        There are two levels of usage. The high-level one uses utility
        functions in liblinearutil.py and commonutil.py (shared with LIBSVM
        and imported by svmutil.py). The usage is the same as the LIBLINEAR
        MATLAB interface.
        
        >>> from liblinear.liblinearutil import *
        # Read data in LIBSVM format
        >>> y, x = svm_read_problem('../heart_scale')
        >>> m = train(y[:200], x[:200], '-c 4')
        >>> p_label, p_acc, p_val = predict(y[200:], x[200:], m)
        
        # Construct problem in python format
        # Dense data
        >>> y, x = [1,-1], [[1,0,1], [-1,0,-1]]
        # Sparse data
        >>> y, x = [1,-1], [{1:1, 3:1}, {1:-1,3:-1}]
        >>> prob  = problem(y, x)
        >>> param = parameter('-s 0 -c 4 -B 1')
        >>> m = train(prob, param)
        
        # Other utility functions
        >>> save_model('heart_scale.model', m)
        >>> m = load_model('heart_scale.model')
        >>> p_label, p_acc, p_val = predict(y, x, m, '-b 1')
        >>> ACC, MSE, SCC = evaluations(y, p_label)
        
        # Getting online help
        >>> help(train)
        
        The low-level use directly calls C interfaces imported by liblinear.py. Note that
        all arguments and return values are in ctypes format. You need to handle them
        carefully.
        
        >>> from liblinear.liblinear import *
        >>> prob = problem([1,-1], [{1:1, 3:1}, {1:-1,3:-1}])
        >>> param = parameter('-c 4')
        >>> m = liblinear.train(prob, param) # m is a ctype pointer to a model
        # Convert a Python-format instance to feature_nodearray, a ctypes structure
        >>> x0, max_idx = gen_feature_nodearray({1:1, 3:1})
        >>> label = liblinear.predict(m, x0)
        
        Quick Start with Scipy
        ======================
        
        Make sure you have Scipy installed to proceed in this section.
        If numba (http://numba.pydata.org) is installed, some operations will be much faster.
        
        There are two levels of usage. The high-level one uses utility functions
        in liblinearutil.py and the usage is the same as the LIBLINEAR MATLAB interface.
        
        >>> import scipy
        >>> from liblinear.liblinearutil import *
        # Read data in LIBSVM format
        >>> y, x = svm_read_problem('../heart_scale', return_scipy = True) # y: ndarray, x: csr_matrix
        >>> m = train(y[:200], x[:200, :], '-c 4')
        >>> p_label, p_acc, p_val = predict(y[200:], x[200:, :], m)
        
        # Construct problem in Scipy format
        # Dense data: numpy ndarray
        >>> y, x = scipy.asarray([1,-1]), scipy.asarray([[1,0,1], [-1,0,-1]])
        # Sparse data: scipy csr_matrix((data, (row_ind, col_ind))
        >>> y, x = scipy.asarray([1,-1]), scipy.sparse.csr_matrix(([1, 1, -1, -1], ([0, 0, 1, 1], [0, 2, 0, 2])))
        >>> prob  = problem(y, x)
        >>> param = parameter('-s 0 -c 4 -B 1')
        >>> m = train(prob, param)
        
        # Apply data scaling in Scipy format
        >>> y, x = svm_read_problem('../heart_scale', return_scipy=True)
        >>> scale_param = csr_find_scale_param(x, lower=0)
        >>> scaled_x = csr_scale(x, scale_param)
        
        # Other utility functions
        >>> save_model('heart_scale.model', m)
        >>> m = load_model('heart_scale.model')
        >>> p_label, p_acc, p_val = predict(y, x, m, '-b 1')
        >>> ACC, MSE, SCC = evaluations(y, p_label)
        
        # Getting online help
        >>> help(train)
        
        The low-level use directly calls C interfaces imported by liblinear.py. Note that
        all arguments and return values are in ctypes format. You need to handle them
        carefully.
        
        >>> from liblinear.liblinear import *
        >>> prob = problem(scipy.asarray([1,-1]), scipy.sparse.csr_matrix(([1, 1, -1, -1], ([0, 0, 1, 1], [0, 2, 0, 2]))))
        >>> param = parameter('-c 4')
        >>> m = liblinear.train(prob, param) # m is a ctype pointer to a model
        # Convert a tuple of ndarray (index, data) to feature_nodearray, a ctypes structure
        # Note that index starts from 0, though the following example will be changed to 1:1, 3:1 internally
        >>> x0, max_idx = gen_feature_nodearray((scipy.asarray([0,2]), scipy.asarray([1,1])))
        >>> label = liblinear.predict(m, x0)
        
        Design Description
        ==================
        
        There are two files liblinear.py and liblinearutil.py, which respectively correspond to
        low-level and high-level use of the interface.
        
        In liblinear.py, we adopt the Python built-in library "ctypes," so that
        Python can directly access C structures and interface functions defined
        in linear.h.
        
        While advanced users can use structures/functions in liblinear.py, to
        avoid handling ctypes structures, in liblinearutil.py we provide some easy-to-use
        functions. The usage is similar to LIBLINEAR MATLAB interface.
        
        Data Structures
        ===============
        
        Three data structures derived from linear.h are node, problem, and
        parameter. They all contain fields with the same names in
        linear.h. Access these fields carefully because you directly use a C structure
        instead of a Python object. The following description introduces additional
        fields and methods.
        
        Before using the data structures, execute the following command to load the
        LIBLINEAR shared library:
        
            >>> from liblinear.liblinear import *
        
        - class feature_node:
        
            Construct a feature_node.
        
            >>> node = feature_node(idx, val)
        
            idx: an integer indicates the feature index.
        
            val: a float indicates the feature value.
        
            Show the index and the value of a node.
        
            >>> print(node)
        
        - Function: gen_feature_nodearray(xi [,feature_max=None])
        
            Generate a feature vector from a Python list/tuple/dictionary, numpy ndarray or tuple of (index, data):
        
            >>> xi_ctype, max_idx = gen_feature_nodearray({1:1, 3:1, 5:-2})
        
            xi_ctype: the returned feature_nodearray (a ctypes structure)
        
            max_idx: the maximal feature index of xi
        
            feature_max: if feature_max is assigned, features with indices larger than
                         feature_max are removed.
        
        - class problem:
        
            Construct a problem instance
        
            >>> prob = problem(y, x [,bias=-1])
        
            y: a Python list/tuple/ndarray of l labels (type must be int/double).
        
            x: 1. a list/tuple of l training instances. Feature vector of
                  each training instance is a list/tuple or dictionary.
        
               2. an l * n numpy ndarray or scipy spmatrix (n: number of features).
        
            bias: if bias >= 0, instance x becomes [x; bias]; if < 0, no bias term
                  added (default -1)
        
            You can also modify the bias value by
        
            >>> prob.set_bias(1)
        
            Note that if your x contains sparse data (i.e., dictionary), the internal
            ctypes data format is still sparse.
        
        - class parameter:
        
            Construct a parameter instance
        
            >>> param = parameter('training_options')
        
            If 'training_options' is empty, LIBLINEAR default values are applied.
        
            Set param to LIBLINEAR default values.
        
            >>> param.set_to_default_values()
        
            Parse a string of options.
        
            >>> param.parse_options('training_options')
        
            Show values of parameters.
        
            >>> print(param)
        
        - class model:
        
            There are two ways to obtain an instance of model:
        
            >>> model_ = train(y, x)
            >>> model_ = load_model('model_file_name')
        
            Note that the returned structure of interface functions
            liblinear.train and liblinear.load_model is a ctypes pointer of
            model, which is different from the model object returned
            by train and load_model in liblinearutil.py. We provide a
            function toPyModel for the conversion:
        
            >>> model_ptr = liblinear.train(prob, param)
            >>> model_ = toPyModel(model_ptr)
        
            If you obtain a model in a way other than the above approaches,
            handle it carefully to avoid memory leak or segmentation fault.
        
            Some interface functions to access LIBLINEAR models are wrapped as
            members of the class model:
        
            >>> nr_feature =  model_.get_nr_feature()
            >>> nr_class = model_.get_nr_class()
            >>> class_labels = model_.get_labels()
            >>> is_prob_model = model_.is_probability_model()
            >>> is_regression_model = model_.is_regression_model()
        
            The decision function is W*x + b, where
                W is an nr_class-by-nr_feature matrix, and
                b is a vector of size nr_class.
            To access W_kj (i.e., coefficient for the k-th class and the j-th feature)
            and b_k (i.e., bias for the k-th class), use the following functions.
        
            >>> W_kj = model_.get_decfun_coef(feat_idx=j, label_idx=k)
            >>> b_k = model_.get_decfun_bias(label_idx=k)
        
            We also provide a function to extract w_k (i.e., the k-th row of W) and
            b_k directly as follows.
        
            >>> [w_k, b_k] = model_.get_decfun(label_idx=k)
        
            Note that w_k is a Python list of length nr_feature, which means that
                w_k[0] = W_k1.
            For regression models, W is just a vector of length nr_feature. Either
            set label_idx=0 or omit the label_idx parameter to access the coefficients.
        
            >>> W_j = model_.get_decfun_coef(feat_idx=j)
            >>> b = model_.get_decfun_bias()
            >>> [W, b] = model_.get_decfun()
        
            For one-class SVM models, label_idx is ignored and b=-rho is
            returned from get_decfun(). That is, the decision function is
            w*x+b = w*x-rho.
        
            >>> rho = model_.get_decfun_rho()
            >>> [W, b] = model_.get_decfun()
        
            Note that in get_decfun_coef, get_decfun_bias, and get_decfun, feat_idx
            starts from 1, while label_idx starts from 0. If label_idx is not in the
            valid range (0 to nr_class-1), then a NaN will be returned; and if feat_idx
            is not in the valid range (1 to nr_feature), then a zero value will be
            returned. For regression models, label_idx is ignored.
        
        Utility Functions
        =================
        
        To use utility functions, type
        
            >>> from liblinear.liblinearutil import *
        
        The above command loads
            train()            : train a linear model
            predict()          : predict testing data
            svm_read_problem() : read the data from a LIBSVM-format file.
            load_model()       : load a LIBLINEAR model.
            save_model()       : save model to a file.
            evaluations()      : evaluate prediction results.
        
        - Function: train
        
            There are three ways to call train()
        
            >>> model = train(y, x [, 'training_options'])
            >>> model = train(prob [, 'training_options'])
            >>> model = train(prob, param)
        
            y: a list/tuple/ndarray of l training labels (type must be int/double).
        
            x: 1. a list/tuple of l training instances. Feature vector of
                  each training instance is a list/tuple or dictionary.
        
               2. an l * n numpy ndarray or scipy spmatrix (n: number of features).
        
            training_options: a string in the same form as that for LIBLINEAR command
                              mode.
        
            prob: a problem instance generated by calling
                  problem(y, x).
        
            param: a parameter instance generated by calling
                   parameter('training_options')
        
            model: the returned model instance. See linear.h for details of this
                   structure. If '-v' is specified, cross validation is
                   conducted and the returned model is just a scalar: cross-validation
                   accuracy for classification and mean-squared error for regression.
        
                   If the '-C' option is specified, best parameters are found
                   by cross validation. The parameter selection utility is supported
                   only by -s 0, -s 2 (for finding C) and -s 11 (for finding C, p).
                   The returned structure is a triple with the best C, the best p,
                   and the corresponding cross-validation accuracy or mean squared
                   error. The returned best p for -s 0 and -s 2 is set to -1 because
                   the p parameter is not used by classification models.
        
        
            To train the same data many times with different
            parameters, the second and the third ways should be faster..
        
            Examples:
        
            >>> y, x = svm_read_problem('../heart_scale')
            >>> prob = problem(y, x)
            >>> param = parameter('-s 3 -c 5 -q')
            >>> m = train(y, x, '-c 5')
            >>> m = train(prob, '-w1 5 -c 5')
            >>> m = train(prob, param)
            >>> CV_ACC = train(y, x, '-v 3')
            >>> best_C, best_p, best_rate = train(y, x, '-C -s 0') # best_p is only for -s 11
            >>> m = train(y, x, '-c {0} -s 0'.format(best_C)) # use the same solver: -s 0
        
        - Function: predict
        
            To predict testing data with a model, use
        
            >>> p_labs, p_acc, p_vals = predict(y, x, model [,'predicting_options'])
        
            y: a list/tuple/ndarray of l true labels (type must be int/double).
               It is used for calculating the accuracy. Use [] if true labels are
               unavailable.
        
            x: 1. a list/tuple of l training instances. Feature vector of
                  each training instance is a list/tuple or dictionary.
        
               2. an l * n numpy ndarray or scipy spmatrix (n: number of features).
        
            predicting_options: a string of predicting options in the same format as
                                that of LIBLINEAR.
        
            model: a model instance.
        
            p_labels: a list of predicted labels
        
            p_acc: a tuple including accuracy (for classification), mean
                   squared error, and squared correlation coefficient (for
                   regression).
        
            p_vals: a list of decision values or probability estimates (if '-b 1'
                    is specified). If k is the number of classes, for decision values,
                    each element includes results of predicting k binary-class
                    SVMs. If k = 2 and solver is not MCSVM_CS, only one decision value
                    is returned. For probabilities, each element contains k values
                    indicating the probability that the testing instance is in each class.
                    Note that the order of classes here is the same as 'model.label'
                    field in the model structure.
        
            Example:
        
            >>> m = train(y, x, '-c 5')
            >>> p_labels, p_acc, p_vals = predict(y, x, m)
        
        - Functions: svm_read_problem/load_model/save_model
        
            See the usage by examples:
        
            >>> y, x = svm_read_problem('data.txt')
            >>> m = load_model('model_file')
            >>> save_model('model_file', m)
        
        - Function: evaluations
        
            Calculate some evaluations using the true values (ty) and the predicted
            values (pv):
        
            >>> (ACC, MSE, SCC) = evaluations(ty, pv, useScipy)
        
            ty: a list/tuple/ndarray of true values.
        
            pv: a list/tuple/ndarray of predicted values.
        
            useScipy: convert ty, pv to ndarray, and use scipy functions to do the evaluation
        
            ACC: accuracy.
        
            MSE: mean squared error.
        
            SCC: squared correlation coefficient.
        
        - Function: csr_find_scale_parameter/csr_scale
        
            Scale data in csr format.
        
            >>> param = csr_find_scale_param(x [, lower=l, upper=u])
            >>> x = csr_scale(x, param)
        
            x: a csr_matrix of data.
        
            l: x scaling lower limit; default -1.
        
            u: x scaling upper limit; default 1.
        
            The scaling process is: x * diag(coef) + ones(l, 1) * offset'
        
            param: a dictionary of scaling parameters, where param['coef'] = coef and param['offset'] = offset.
        
            coef: a scipy array of scaling coefficients.
        
            offset: a scipy array of scaling offsets.
        
        Additional Information
        ======================
        
        This interface was originally written by Hsiang-Fu Yu from Department of Computer
        Science, National Taiwan University. If you find this tool useful, please
        cite LIBLINEAR as follows
        
        R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin.
        LIBLINEAR: A Library for Large Linear Classification, Journal of
        Machine Learning Research 9(2008), 1871-1874. Software available at
        http://www.csie.ntu.edu.tw/~cjlin/liblinear
        
        For any question, please contact Chih-Jen Lin <cjlin@csie.ntu.edu.tw>,
        or check the FAQ page:
        
        http://www.csie.ntu.edu.tw/~cjlin/liblinear/faq.html
        
Platform: UNKNOWN
Description-Content-Type: text/plain
