With an ever-growing suite of connectors, Aspire 5.0 is an enabling technology for scanning, analyzing, searching, and copying all types of data (structured and unstructured) within an organization, across thousands of data systems. It provides unprecedented visibility, analysis, and control over massive amounts of highly heterogeneous data.

With a decade of experience and more than 150 customers, Aspire has proven to be a valuable solution for a variety of use cases, such as NLP integration solutions with Saga, enterprise search, unstructured content analytics, and certification and compliance processes. New components may also be developed and deployed using a robust and tested SDK.

Aspire has evolved over time, adapting to ever-changing technologies, trends and, especially, our customers’ problems. Aspire 5.0, our latest release, is a major architectural upgrade that enables new capabilities and technologies, making it easier than ever to maintain and automate crawling and search procedures while keeping the flexibility and power of our strong connector suite, even across hundreds of thousands of sources.

What’s new in Aspire 5?

Distributed local scanning

Customers have asked us to scan sensitive PII and PHI content without crossing borders. Our new job routing policies allow content ingestion and scanning to be executed in a specific geographic location while being centrally managed from another country. This reduces network latency and, very importantly, complies with laws that require certain types of data not to leave a country’s borders.
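To make the routing idea concrete, here is a minimal sketch of how a region-pinned job could be matched to worker nodes. The names `WorkerNode`, `Job`, `required_region` and `route` are hypothetical and chosen for illustration only; they are not Aspire's configuration model or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WorkerNode:
    name: str
    region: str  # hypothetical label, e.g. "eu" or "us"


@dataclass(frozen=True)
class Job:
    seed: str
    required_region: Optional[str] = None  # None means "may run anywhere"


def eligible_nodes(job: Job, nodes: List[WorkerNode]) -> List[WorkerNode]:
    """Return the nodes that are allowed to process this job."""
    if job.required_region is None:
        return list(nodes)
    return [n for n in nodes if n.region == job.required_region]


def route(job: Job, nodes: List[WorkerNode]) -> WorkerNode:
    """Pick the first eligible node; refuse to run rather than cross a border."""
    candidates = eligible_nodes(job, nodes)
    if not candidates:
        raise LookupError(f"no worker node available in region {job.required_region!r}")
    return candidates[0]
```

The point of the sketch is the failure mode: when no node exists in the required region, the job is rejected instead of silently falling back to a node elsewhere.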

Scalability

Aspire has been used, for example, by customers with tens of thousands of shared drives and SharePoint site collections. To handle these scenarios, Accenture designed Aspire 5 with scaling in mind. It offers native support for containerization (Docker and Kubernetes deployments), as well as shareable configuration components such as server URLs, credentials, crawl behaviors and content processing rules. These components can be configured using the new Admin UI or the REST API, which makes configuration management easier in environments with thousands of starting points, known as seeds.
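The benefit of shareable configuration components can be illustrated with a small sketch in which many seeds point at one connection and one credential by ID. The classes and field names below (`Credential`, `Connection`, `Seed`, `resolve`) are assumptions made for this example and do not reflect Aspire's actual data model.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Credential:
    id: str
    username: str


@dataclass
class Connection:
    id: str
    server_url: str
    credential_id: str  # reference to a shared Credential, not a copy


@dataclass
class Seed:
    url: str
    connection_id: str  # reference to a shared Connection, not a copy


def resolve(seed: Seed,
            connections: Dict[str, Connection],
            credentials: Dict[str, Credential]) -> dict:
    """Build the effective settings for one seed by following its references."""
    connection = connections[seed.connection_id]
    credential = credentials[connection.credential_id]
    return {
        "seed": seed.url,
        "server_url": connection.server_url,
        "username": credential.username,
    }
```

Because seeds hold references rather than copies, changing the shared credential once is enough for every seed that resolves through it.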

Throttling

One of our clients, a large chemical corporation, had struggled to scan its vast library of sites because of the vendor’s service throttling policies, which made it hard to coordinate when, how much and how often to scan. The most widely used content repositories today are provided by cloud-based vendors (such as SharePoint, OneDrive, Box.com, S3, Confluence or Salesforce, to name a few), which impose request limits on their public REST APIs (generating HTTP 429 responses) to ensure availability for all their customers. When those limits will be reached can be hard or even impossible to predict.

Aspire offers a standard way of ensuring the rate of requests to the source repositories is always under control, no matter how many Aspire nodes are being used for parallel scanning, or even how many jobs are concurrently active at any given time.

Since these throttle mechanisms are enforced per account, per client App ID, or even per resource being scanned, Aspire 5 offers a way to limit requests based on shared Credentials, Connections, or a specific Seed (site, shared drive, bucket, etc.).
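As an illustration of limiting requests by shared scope, the sketch below keeps one fixed-window counter per key, where a key is a pair such as `("credential", "svc-account")` or `("seed", "site-a")`. It is a single-process, in-memory approximation under assumed names (`ThrottleRegistry`, `set_limit`, `try_acquire`); a multi-node deployment would need the counters in a shared store, and this is not how Aspire itself is implemented.

```python
import time


class ThrottleRegistry:
    """Fixed-window request counters shared by every job that uses the same key."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._limits = {}   # key -> (max_requests, window_seconds)
        self._windows = {}  # key -> (window_start, requests_in_window)

    def set_limit(self, key, max_requests, window_seconds):
        self._limits[key] = (max_requests, window_seconds)

    def try_acquire(self, keys):
        """Count one request against every limited key, or return False
        (counting nothing) if any of those keys has used up its window."""
        now = self._clock()
        updated = {}
        for key in keys:
            if key not in self._limits:
                continue  # keys without a configured limit are unrestricted
            max_requests, window = self._limits[key]
            start, count = self._windows.get(key, (now, 0))
            if now - start >= window:
                start, count = now, 0  # the previous window has expired
            if count >= max_requests:
                return False
            updated[key] = (start, count + 1)
        self._windows.update(updated)
        return True
```

A caller would pass all scopes that apply to a request, for example its credential and its seed, so that two jobs scanning different sites with the same account still share the account's budget.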

Scheduling

Our internal Accenture Corporate Search Site uses Aspire to ingest content from dozens of different repositories, and there is a complex process the data must go through before it is ready to be searched. This process involves many different scanning stages that must happen in a specific sequence. Aspire 5 introduces sequence schedules which enable the execution of multiple scanning jobs in sequenced chains, with the possibility of creating circular chains to ensure around-the-clock scanning.
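The chaining behavior can be sketched as a mapping from each job to the job that follows it. The function `run_chain` and its parameters are hypothetical names for this example; the `max_runs` cap exists only so that a circular chain terminates in a demonstration, whereas a real scheduler would keep going.

```python
def run_chain(next_job, start, run, max_runs):
    """Run jobs one after another, following next_job links,
    for at most max_runs executions."""
    executed = []
    current = start
    while current is not None and len(executed) < max_runs:
        run(current)                      # execute the scanning job
        executed.append(current)
        current = next_job.get(current)   # None ends a non-circular chain
    return executed


# A linear chain stops after its last job.
linear = {"scan": "enrich", "enrich": "publish"}

# Linking the last job back to the first makes the chain circular,
# so scanning restarts as soon as the final stage completes.
circular = dict(linear, publish="scan")
```

With the circular mapping, `run_chain(circular, "scan", print, 5)` runs scan, enrich, publish, then scan and enrich again before the cap stops it.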

Automation

Some customers adopt Aspire as part of a larger solution to classify data across their organizations, so it is often updated automatically by other systems. Aspire 5 has reinvented how it interacts with these systems by taking a REST API-first design approach: the REST APIs have been designed for easy adoption, and our new Admin UI uses them as well.

The new REST API allows for automated configuration and deployment of content ingestion jobs, even from CI/CD pipelines.

Generic REST Connector

More often than ever, clients come to us with shorter schedules to deploy solutions. These include fetching data from systems for which Aspire does not yet have native connectors.

A new generic connector has been added to our library: the generic REST Connector, which enables faster implementation of new connectors simply by configuring it through the Aspire UI (no Java code). The REST Connector can connect to any JSON-based REST endpoint, scan for entities, extract metadata, fetch contents, and recursively call other endpoints.
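The configuration-driven approach described above can be sketched as a small recursive crawler whose behavior is controlled entirely by a dictionary of field names. Everything here is an assumption for illustration: the config keys (`items_field`, `id_field`, `metadata_fields`, `children_field`), the function names, and the response shape are invented for the example and are not the REST Connector's real settings.

```python
import json
from urllib.request import urlopen


def http_fetch_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def crawl(url, config, fetch_json=http_fetch_json, seen=None):
    """Yield one record per entity found at url, following child endpoints recursively."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against endpoints that link back to each other
        return
    seen.add(url)
    payload = fetch_json(url)
    for entity in payload.get(config["items_field"], []):
        yield {
            "id": entity[config["id_field"]],
            "metadata": {k: entity.get(k) for k in config["metadata_fields"]},
        }
        child_url = entity.get(config["children_field"])
        if child_url:
            yield from crawl(child_url, config, fetch_json, seen)
```

Passing `fetch_json` as a parameter keeps the traversal logic independent of the transport, so the same function can be exercised against canned responses or a live endpoint.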

Document Level Security & Identity Crawling

Document Level Security has always been an important part of any enterprise search solution, since it must ensure users can search and find only data they have access to, and cloud-based search services have also embraced this need.

We have often seen the need to ingest heterogeneous user directories from different systems into cloud-based search services, or even into on-premises search engines such as Elasticsearch, and then to combine them into a holistic user/group database. The search application needs to understand user/group memberships so that it can trim results based on the documents’ ACLs, even for documents coming from different source repositories with different security identity systems (Azure AD, LDAP, etc.).

Aspire 5 has made this easier than ever by introducing a new type of crawl: Identity Crawls, which enable crawling the repository-specific identity information related to the documents’ ACLs. The Aspire Group Expansion process can later combine this information and expand it with other identity information.
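The general idea of combining identity sources and trimming results can be sketched as follows. This is an illustration of group expansion in general, not Aspire's Group Expansion implementation: the function names are hypothetical, and the sketch assumes that a user has already been mapped to the same principal name in every source directory.

```python
def merge_memberships(*sources):
    """Combine several 'member -> groups' mappings into one."""
    merged = {}
    for source in sources:
        for member, groups in source.items():
            merged.setdefault(member, set()).update(groups)
    return merged


def expand_groups(user, memberships):
    """Return every group the user belongs to, directly or through nested groups."""
    expanded = set()
    pending = list(memberships.get(user, ()))
    while pending:
        group = pending.pop()
        if group not in expanded:  # also protects against cyclic group nesting
            expanded.add(group)
            pending.extend(memberships.get(group, ()))
    return expanded


def trim_results(user, documents, memberships):
    """Keep only the documents whose ACL names the user or one of the user's groups."""
    allowed = expand_groups(user, memberships) | {user}
    return [doc["id"] for doc in documents if allowed & set(doc["acl"])]
```

For example, if one directory places a user in a staff group and another directory nests that group inside an everyone group, the expanded set contains both, so a document whose ACL lists only the everyone group is still returned for that user.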

Here are some additional screenshots of our brand-new Web UI.

Visit our Aspire technical documentation for more details.


This document is produced by consultants at Accenture as general guidance. It is not intended to provide specific advice on your circumstances. If you require advice or further details on any matters referred to, please contact your Accenture representative.

This document makes descriptive reference to trademarks that may be owned by others. The use of such trademarks herein is not an assertion of ownership of such trademarks by Accenture and is not intended to represent or imply the existence of an association between Accenture and the lawful owners of such trademarks. No sponsorship, endorsement, or approval of this content by the owners of such trademarks is intended, expressed, or implied.

Accenture provides the information on an “as-is” basis without representation or warranty and accepts no liability for any action or failure to act taken in response to the information contained or referenced in this publication.

Andres Umaña

Aspire Architect, Applied Intelligence
