Metadata-Version: 2.1
Name: marqo
Version: 0.1.7
Summary: Neural search for humans
Home-page: UNKNOWN
Author: marqo org
Author-email: org@marqo.io
License: UNKNOWN
Description: <p align="center">
          <img src="assets/logo.svg" alt="Marqo" width="150" height="150" />
        </p>
        
        <h1 align="center">Marqo</h1>
        
        <p align="center">
          <b>Neural search for humans.</b>
        </p>
        
        <p align="center">
          <a align="center" href="https://join.slack.com/t/marqo-community/shared_invite/zt-1d737l76e-u~b3Rvey2IN2nGM4wyr44w"><img src="https://img.shields.io/badge/Slack-blueviolet?logo=slack&amp;logoColor=white&style=flat-square"></a>
        </p>
        
        A deep-learning powered, open-source search engine which seamlessly integrates with your applications, websites, and workflow. 
        
        <!-- end marqo-description -->
        
        ## Get started
        
        1. Marqo requires Docker. To install Docker, go to https://docs.docker.com/get-docker/
        2. Use Docker to run [OpenSearch](https://opensearch.org/):
        ```bash
        docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:2.1.0
        ```
        3. Install the Marqo client:
        ```bash
        pip install marqo
        ```
        4. Start indexing and searching! Let's look at a simple example below:
        
        ```python
        import marqo
        
        mq = marqo.Client(url='https://localhost:9200', main_user="admin", main_password="admin")
        
        mq.index("my-first-index").add_documents([
            {
                "Title": "The Travels of Marco Polo",
                "Description": "A 13th-century travelogue describing Polo's travels"
            }, 
            {
                "Title": "Extravehicular Mobility Unit (EMU)",
                "Description": "The EMU is a spacesuit that provides environmental protection, "
                               "mobility, life support, and communications for astronauts",
                "_id": "article_591"
            }]
        )
        
        results = mq.index("my-first-index").search(
            q="What is the best outfit to wear on the moon?"
        )
        
        ```
        
        - `mq` is the client that wraps the `marqo` API
        - `add_documents()` takes a list of documents, represented as Python dicts, for indexing
        - If the index doesn't exist, `add_documents()` creates it with default settings. If it already exists, Marqo adds the documents to it
        - You can optionally set a document's ID with the special `_id` field. Otherwise, Marqo will generate one
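        
        If you would rather know every document's ID before indexing, one option is to assign IDs yourself. The sketch below is illustrative only: `with_ids` is a hypothetical helper, not part of the Marqo client, and the use of random UUIDs is an assumption rather than a description of how Marqo generates IDs.
        
        ```python
        import uuid
        
        def with_ids(documents):
            """Return copies of the documents, adding a random `_id` where none is set.
        
            Hypothetical helper for illustration; not part of the Marqo client.
            """
            return [{**doc, "_id": doc.get("_id", str(uuid.uuid4()))} for doc in documents]
        
        docs = [
            {"Title": "The Travels of Marco Polo"},
            {"Title": "Extravehicular Mobility Unit (EMU)", "_id": "article_591"},
        ]
        prepared = with_ids(docs)
        
        # Existing IDs are kept; missing ones are filled in with a UUID string.
        for doc in prepared:
            print(doc["_id"], "->", doc["Title"])
        ```
        
        The prepared list can then be passed to `add_documents()` in place of the original documents.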
        
        Printing the results should produce output like this:
        
        
        ```python
        # let's print out the results:
        import pprint
        pprint.pprint(results)
        
        {
            'hits': [
                {   
                    'Title': 'Extravehicular Mobility Unit (EMU)',
                    'Description': 'The EMU is a spacesuit that provides environmental protection, mobility, life support, and '
                                   'communications for astronauts',
                    '_highlights': {
                        'Description': 'The EMU is a spacesuit that provides environmental protection, '
                                       'mobility, life support, and communications for astronauts'
                    },
                    '_id': 'article_591',
                    '_score': 1.2387788
                }, 
                {   
                    'Title': 'The Travels of Marco Polo',
                    'Description': "A 13th-century travelogue describing Polo's travels",
                    '_highlights': {'Title': 'The Travels of Marco Polo'},
                    '_id': 'e00d1a8d-894c-41a1-8e3b-d8b2a8fce12a',
                    '_score': 1.2047464
                }
            ],
            'limit': 10,
            'processingTimeMs': 49,
            'query': 'What is the best outfit to wear on the moon?'
        }
        ```
        
        - Each hit corresponds to a document that matched the search query
        - Hits are ordered from best to worst match
        - `limit` is the maximum number of hits to be returned. This can be set as a parameter during search
        - Each hit has a `_highlights` field, which contains the part of the document that best matched the query
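        
        As a small sketch of how these fields might be consumed, the snippet below walks a response shaped like the one above and pulls out each hit's ID, score, and highlighted field. The `summarize_hits` helper and the abbreviated sample data are illustrative, not part of the Marqo client.
        
        ```python
        # Sample response shaped like the search output above (text abbreviated).
        results = {
            "hits": [
                {
                    "Title": "Extravehicular Mobility Unit (EMU)",
                    "_highlights": {"Description": "The EMU is a spacesuit ..."},
                    "_id": "article_591",
                    "_score": 1.2387788,
                },
                {
                    "Title": "The Travels of Marco Polo",
                    "_highlights": {"Title": "The Travels of Marco Polo"},
                    "_id": "e00d1a8d-894c-41a1-8e3b-d8b2a8fce12a",
                    "_score": 1.2047464,
                },
            ],
            "limit": 10,
            "processingTimeMs": 49,
            "query": "What is the best outfit to wear on the moon?",
        }
        
        def summarize_hits(results):
            """Return (id, score, highlighted field name) for each hit, in ranked order."""
            summary = []
            for hit in results["hits"]:
                highlights = hit.get("_highlights", {})
                # Take the first highlighted field name, or None if there is none.
                field = next(iter(highlights), None)
                summary.append((hit["_id"], hit["_score"], field))
            return summary
        
        for doc_id, score, field in summarize_hits(results):
            print(f"{doc_id}: score={score:.4f}, best match in {field!r}")
        ```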
        
        
        ## Other basic operations
        
        ### Get document
        Retrieve a document by ID.
        
        ```python
        result = mq.index("my-first-index").get_document(document_id="e197e580-0393-4f4e-90e9-8cdf4b17e339")
        ```
        
        Note that adding a document with `add_documents` again using the same `_id` will update the existing document.
        
        ### Get Index
        Get data about an index.
        
        ```python
        results = mq.get_index("my-first-index")
        ```
        
        ### Delete Index
        Delete an index.
        
        ```python
        results = mq.index("my-first-index").delete()
        ```
        
        ### Lexical search
        Search using a BM25 query.
        
        ```python
        result = mq.index("my-first-index").search('marco polo', lexical=True)
        ```
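        
        For intuition about what a BM25 query ranks on, here is a minimal sketch of the textbook BM25 formula in plain Python. It is not how Marqo or OpenSearch implement lexical search: the whitespace tokenizer, the `k1` and `b` defaults, and the `bm25_scores` helper are all assumptions made for illustration.
        
        ```python
        import math
        from collections import Counter
        
        def bm25_scores(query, docs, k1=1.2, b=0.75):
            """Score each document against the query with a textbook BM25 formula.
        
            Illustrative only: naive lowercase/whitespace tokenization, and `docs`
            must be a non-empty list of strings.
            """
            tokenized = [doc.lower().split() for doc in docs]
            n_docs = len(tokenized)
            avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
            query_terms = query.lower().split()
        
            scores = []
            for tokens in tokenized:
                counts = Counter(tokens)
                score = 0.0
                for term in query_terms:
                    # Document frequency: how many documents contain the term.
                    df = sum(1 for other in tokenized if term in other)
                    if df == 0:
                        continue
                    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
                    tf = counts[term]
                    length_norm = 1 - b + b * len(tokens) / avg_len
                    score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
                scores.append(score)
            return scores
        
        docs = ["The Travels of Marco Polo", "Extravehicular Mobility Unit (EMU)"]
        scores = bm25_scores("marco polo", docs)
        print(scores)
        ```
        
        Unlike the default neural search, this kind of scoring only rewards exact term overlap, so a query such as "moon outfit" would score zero against both sample documents.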
        
        ### Search specific fields
        Restrict the search to specific fields using `searchable_attributes`.
        
        ```python
        result = mq.index("my-first-index").search('marco polo', searchable_attributes=['Title'])
        ```
        
        ## Multi modal and cross modal search
        
        To power image and text search, Marqo allows users to plug and play with CLIP models from HuggingFace. **Note that if you do not configure multimodal search, image URLs will be treated as strings.** To start indexing and searching with images, first create an index with a CLIP configuration, as below:
        
        ```python
        
        settings = {
          "treat_urls_and_pointers_as_images":True,   # allows us to find an image file and index it 
          "model":"ViT-B/32"
        }
        response = mq.create_index("my-multimodal-index", **settings)
        ```
        
        Images can then be added within documents as follows. You can use URLs from the internet (for example S3) or paths on the local disk of the machine:
        
        ```python
        
        responses = mq.index("my-multimodal-index").add_documents([{
            "Image": "/mnt/images/spacesuit.png",
            "Description": "The EMU is a spacesuit that provides environmental protection, "
                           "mobility, life support, and communications for astronauts",
            "_id": "article_591"
        }], batch_size=50, use_parallel=True)
        
        ```
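        
        The `batch_size` argument above suggests that documents are sent in chunks rather than all at once. As a rough sketch of what that kind of batching looks like (the `batched` helper is hypothetical and not taken from the Marqo client):
        
        ```python
        def batched(documents, batch_size):
            """Yield successive slices of at most `batch_size` documents."""
            if batch_size < 1:
                raise ValueError("batch_size must be at least 1")
            for start in range(0, len(documents), batch_size):
                yield documents[start:start + batch_size]
        
        # 120 made-up documents, split into batches of 50.
        docs = [{"_id": f"doc_{i}", "Description": f"document number {i}"} for i in range(120)]
        batch_sizes = [len(batch) for batch in batched(docs, batch_size=50)]
        print(batch_sizes)  # [50, 50, 20]
        ```
        
        Smaller batches keep each request small, which can matter when documents reference large images.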
        
        To search specifically against the image attribute, use `searchable_attributes=['Image']`, as shown under "Searching using an image". Otherwise, you can search using text as usual:
        
        ```python
        
        results = mq.index("my-multimodal-index").search('spaceman')
        
        ```
        
        ### Searching using an image
        To search using an image, provide a link to the image as the query. In this example, `searchable_attributes` restricts the search to the image data. You can use URLs from the internet (for example S3) or paths on the local disk of the machine:
        
        ```python
        results = mq.index("my-multimodal-index").search('https://api.claylings.io/api/image/190', searchable_attributes=['Image'])
        ```
        
        ## Warning
        
        Do not run other applications on the OpenSearch cluster, because Marqo automatically changes and adapts the cluster's settings.
        
        ## Contributors
        Marqo is a community project with the goal of making neural search accessible to the wider developer community. We are glad that you are interested in helping out! Please read [this](./CONTRIBUTING.md) to get started.
        
        ## Dev set up
        1. Create a virtual env: ```python -m venv ./venv```
        2. Activate the virtual environment: ```source ./venv/bin/activate```
        3. Install requirements from the requirements file: ```pip install -r requirements.txt```
        4. Run the tests with tox: `cd` into this directory and then run `tox`
        5. If you update dependencies, make sure to delete the `.tox` directory and rerun
        
        ## Merge instructions:
        1. Run the full test suite (by using the command `tox` in this dir).
        2. Create a pull request with an attached github issue.
        
        The large data test will build Marqo from the main branch and fill indices with data. Go through and test queries against this data: https://github.com/S2Search/NeuralSearchLargeDataTest
        
        <!-- start support-pitch -->
        
        
        ## Support
        
        - Join our [Slack community](https://join.slack.com/t/marqo-community/shared_invite/zt-1d737l76e-u~b3Rvey2IN2nGM4wyr44w) and chat with other community members about ideas.
        - Marqo community meetings (coming soon!)
        
        <!-- end support-pitch -->
        
Keywords: search python marqo opensearch neural semantic vector embedding
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3.8
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Requires-Python: >=3
Description-Content-Type: text/markdown
