Get Started with Text Analytics Toolbox

Text Analytics Toolbox™ provides algorithms and visualizations for preprocessing, analyzing, and modeling text data. Models created with the toolbox can be used in applications such as sentiment analysis, predictive maintenance, and topic modeling.

Learn more at: products/text-analytics

Visualize

Use word clouds and text scatter plots to summarize and validate results.

Function Name       Description
wordcloud           Create word cloud chart from bag-of-words or LDA model
wordCloudCounts     Count words for word cloud creation
textscatter         2-D scatter plot of text
textscatter3        3-D scatter plot of text
heatmap             Create heatmap chart
histcounts          Histogram bin counts
discretize          Group data into bins or categories
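
As a rough illustration of the visualization step, the sketch below counts words in a short sample string and draws a word cloud from the result. The sample text is made up for this example, and the snippet assumes Text Analytics Toolbox is installed and that a figure window can be opened.

```matlab
% Illustrative sample text (hypothetical data, not from the toolbox).
str = "Sensors reported vibration. Vibration sensors reported heat. Heat damaged the motor.";

% Estimate word counts for a word cloud; the function tokenizes and
% preprocesses the text before counting.
T = wordCloudCounts(str);

% Draw the word cloud from the table of words and counts.
figure
wordcloud(T, 'Word', 'Count');
title("Word counts in the sample text")
```

The table returned by wordCloudCounts can also be inspected directly before plotting, which is one way to check that preprocessing kept the words you expect.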

Model and Predict

Convert text into numeric representations using bag-of-words or pretrained word embedding models, and apply specialized machine learning algorithms for prediction and topic modeling.

Function Name         Description
readWordEmbedding     Read word embedding from text file
trainWordEmbedding    Train word embedding
word2vec/vec2word     Maps words to embedding vectors
ldaModel              Latent Dirichlet allocation (LDA) model
lsaModel              Latent semantic analysis (LSA) model
bagOfWords            Bag-of-words model
fitlda                Fit latent Dirichlet allocation (LDA) model
fitlsa                Fit a latent semantic analysis (LSA) model
predict               Predict top LDA topics of documents
fitdist               Fit probability distribution object to data
fitrlinear            Fit linear regression model to high-dimensional data
fitclinear            Fit linear classification model to high-dimensional data
fitcecoc              Fit multiclass models for classifiers
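
The modeling workflow described above can be sketched as follows: build a bag-of-words model, fit an LDA topic model, and predict the top topic of a new document. The four-document corpus is invented for illustration and is far too small for meaningful topics; the choice of two topics is likewise an assumption.

```matlab
% Small illustrative corpus (hypothetical data).
str = [
    "the engine overheated and the coolant pump failed"
    "coolant leak caused the engine to overheat"
    "the invoice was sent late and payment is overdue"
    "late payment fees were added to the invoice"];
documents = tokenizedDocument(str);

% Bag-of-words: word counts per document over a shared vocabulary.
bag = bagOfWords(documents);

% Fit an LDA model with two topics; 'Verbose',0 suppresses progress output.
numTopics = 2;
mdl = fitlda(bag, numTopics, 'Verbose', 0);

% Predict the most probable topic index for a new document.
newDoc = tokenizedDocument("the pump failed and the engine overheated");
topicIdx = predict(mdl, newDoc);
```

LDA fitting is randomized, so the topic index assigned to a given document can differ between runs.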



Import

Extract text from Microsoft® Word® files, PDFs, text files, and spreadsheets.

Function Name           Description
extractFileText         Read from PDF, Microsoft Word, and plain text
textscan                Read formatted data from text file or string
readtable               Create table from file
compose                 Convert data into formatted string array
xlsread                 Read Microsoft Excel spreadsheet file
webread                 Read content from RESTful web service
TabularTextDatastore    Datastore for tabular text files
FileDatastore           Datastore with custom file reader
SpreadsheetDatastore    Datastore for spreadsheet files
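
A minimal import sketch, assuming Text Analytics Toolbox is installed: it writes a small plain-text file to the temporary folder so the example is self-contained, then reads it back with extractFileText. The file name and its contents are hypothetical; the same call accepts PDF and Microsoft Word files.

```matlab
% Create a small text file to read (hypothetical file name and contents).
filename = fullfile(tempdir, 'example_notes.txt');
fid = fopen(filename, 'w');
fprintf(fid, 'First line of text.\nSecond line of text.\n');
fclose(fid);

% Read the whole file into a single string.
str = extractFileText(filename);

% Split the text into one string per line for further processing.
textLines = splitlines(strtrim(str));

% Remove the temporary file.
delete(filename);
```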

Preprocess

Remove less helpful artifacts such as common words, punctuation, and URLs, and apply text normalization to reduce words to their root form.

Function Name            Description
tokenizedDocument        Split documents into collections of words
normalizeWords           Remove inflections from words using the Porter stemmer
bagOfWords               Bag-of-words model
stopWords                Stop word list
context                  Search documents for word occurrences in context
removeWords              Remove selected words from document or bag-of-words
removeLongWords          Remove long words from documents or bag-of-words
removeShortWords         Remove short words from documents or bag-of-words
removeInfrequentWords    Remove words with low counts from bag-of-words model
erasePunctuation         Erase punctuation from text and documents
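
One possible preprocessing pipeline built from the functions above is sketched here. The two sample sentences are made up, and the thresholds (words of two characters or fewer, words seen at most once) are arbitrary choices for illustration. The text is lowercased first because the stop word list is lowercase and removeWords is case sensitive.

```matlab
% Illustrative raw text (hypothetical data).
str = [
    "The quick brown fox jumped over the lazy dog."
    "A fast brown fox leaps over lazy dogs and cats!"];

% Lowercase, then split each document into word tokens.
documents = tokenizedDocument(lower(str));

% Strip punctuation, stop words, and very short words.
documents = erasePunctuation(documents);
documents = removeWords(documents, stopWords);
documents = removeShortWords(documents, 2);

% Reduce words to a root form.
documents = normalizeWords(documents);

% Count words, then drop words that appear at most once across the corpus.
bag = bagOfWords(documents);
bag = removeInfrequentWords(bag, 1);
```

The order of the steps matters: stop word removal is done before stemming here so that the words still match the entries in the stop word list.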

String

Manipulate, compare, and store text data efficiently.

Function Name                 Description
str = "Hello,world"           Declare a string variable
str = ["Hello", "World"]      Declare a string array
str = string(C)               Convert a character vector C to a string
str2double                    Convert a string to double numbers
strlength                     Return the length of strings
isstring                      Determine if input is string array
join                          Combine strings
split                         Split strings in string array
splitlines                    Split string at newline characters
replace                       Find and replace substrings in string array
contains                      Determine if pattern is in string
erase                         Delete substrings within strings
extractBetween                Extract substrings between indicators
extractAfter                  Extract substring after specified position
extractBefore                 Extract substring before specified position
strcmp                        Compare strings
regexp                        Match regular expression (case sensitive)
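
The short sketch below exercises several of the string functions listed in this section using only base MATLAB. The variable names and the HTML-like sample passed to extractBetween are illustrative choices.

```matlab
% Declare a string scalar and a string array.
str = "Hello,world";
arr = ["Hello", "World"];

% Convert a character vector to a string, and numeric text to a double.
C = 'text analytics';
s = string(C);
x = str2double("3.14");

% Inspect, split, and recombine strings.
n = strlength(str);                 % number of characters in str
parts = split(str, ",");            % column array: "Hello" and "world"
joined = join(arr, " ");            % single string "Hello World"

% Search, replace, erase, and extract substrings.
hasWorld = contains(str, "world");
swapped = replace(str, "world", "MATLAB");
noComma = erase(str, ",");
inner = extractBetween("<b>bold</b>", "<b>", "</b>");
tail = extractAfter(str, ",");
head = extractBefore(str, ",");

% Compare strings (case sensitive).
same = strcmp(head, "Hello");
```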

© 2019 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.



11/19
