Differences between revisions 2 and 3
Revision 2 as of 2023-06-08 00:38:05
Size: 2316
Comment:
Revision 3 as of 2025-10-24 15:29:32
Size: 3662
Comment: Rewrite
Line 3: Line 3:
Stata supports an interpreted language with a matrix-based data model called '''Mata'''. '''Mata''' is a matrix programming language that is bundled with Stata.
Line 11: Line 11:
== Data Types == == Data types ==
Line 13: Line 13:
To instantiate a '''scalar''', the Mata concept for constants, try: '''Scalars''' store a single value, either numeric or string. To declare a scalar, try:
Line 20: Line 20:
To instantiate a '''row vector''', try: '''Vectors''' are collections of values. All elements must be of the same data type; numeric and string values cannot be mixed.

To declare a '''row vector''', try any of:
Line 23: Line 25:
f=(1,2,3) f=(1, 2)
Line 25: Line 27:
f=("A","B","C") f=("A", "B", "C")
Line 28: Line 30:
Or for a '''column vector''', try: To declare a '''column vector''', try any of:
Line 31: Line 33:
g=(3\ 4 \5) g=(1 \ 2)
Line 33: Line 35:
g=("A"\ "B"\ "C") g=("A" \ "B" \ "C")
Line 36: Line 38:
'''Matrices''' use the logical combination of this syntax. Vector elements can be accessed by subscripting, with indexing starting at 1. Taking the above examples, `g[1]` returns 1.

To declare a '''matrix''', combine these syntaxes.
Line 39: Line 43:
h=(1,2,3\ 4,5,6\ 7,8,9)
h=("A","B"\ "C","D")
h=(1, 2 \ 3, 4)
h=("A", "B" \ "C", "D")
Line 43: Line 47:
To combine vectors or matrices, try: Matrix elements can be accessed by subscripting: row then column. Taking the above examples, `h[2,2]` returns 4. Furthermore, rows and columns of a matrix can be accessed like `h[2,]` or `h[,2]` (respectively).

To horizontally join vectors, try:
Line 46: Line 52:
x=(1,2\ 6,7)
y=(3,4,5\ 8,9,10)
z=(x,y)
x=(h, g)
Line 50: Line 54:

To vertically join vectors, try:

{{{
y=(h \ f)
}}}

----
Line 53: Line 65:
=== Matrix generation functions === == Operators ==
Line 55: Line 67:
''`J(r, c, a)`'' returns an ''`r x c`'' matrix with every element being ''`a`''. Most operators work as expected, e.g. `X*b` performs [[LinearAlgebra/MatrixMultiplication|matrix multiplication]].
Line 57: Line 69:
''`I(r)`'' returns an ''`r x r`'' identity matrix. New operators include:
 * apostrophe (`'`) for [[LinearAlgebra/Transposition|transposition]], as in `X'`

----



== Built-in functions ==

These functions operate on a matrix:

||'''Name''' ||'''Meaning''' ||
||`det(m)` ||[[LinearAlgebra/Determinant|determinant]] ||
||`trace(m)` ||[[LinearAlgebra/Trace|trace]] ||
||`luinv(m)` ||[[LinearAlgebra/Invertibility|invert]] a square matrix ''m'' ||
||`invsym(m)` ||invert a [[LinearAlgebra/SpecialMatrices#Symmetric_Matrices|symmetric matrix]] ''m''||

These functions construct a new matrix:

||'''Name''' ||'''Meaning''' ||
||`I(n)` ||return the ''n x n'' [[LinearAlgebra/SpecialMatrices#Identity_Matrix|identity matrix]]||
||`J(r, c, x)`||return a ''r x c'' matrix with all elements set to `x` ||
Line 65: Line 98:


=== Reading data ===

To read data from Stata's active session into a matrix, try:
To copy values from Stata's active, in-memory dataset into a matrix, try:
Line 72: Line 101:
q=st_data(., "numeric_var_name")
r=st_data(., ("numeric_var_name"))
s=st_data(., ("numeric_var_1", "numeric_var_2", "numeric_var_3"))
t=st_sdata(., ("string_var_name"))
mymat=st_data(., ("foo", "bar", "baz"))
Line 78: Line 104:
Note that string data needs to be read using ''st_sdata'', not ''st_data''. The arguments to this function are vectors; the first is a column vector selecting rows of the dataset, while the second is a row vector selecting columns of the dataset. Generally the first argument is left as a missing value to mean copy all rows. Columns can be identified by either variable name or ordinal index. (Identification by index is technically faster.)
Line 80: Line 106:
The omission of the first parameter ''(or rather, supplying the missing value as the first parameter)'' indicates that we want ''all'' rows. Note that string variables require the `st_sdata` function instead. It is used in the same way, however.
Line 82: Line 108:
The second parameter can similarly be omitted, though this isn't as useful. Column selectors can be either numeric indices or variable names. Note that selection by numeric indices is faster computationally.



=== Writing data ===

To write a matrix into a variable in Stata's active session, try:
Note also that a different syntax can be used to copy a single variable. The following are equivalent:
Line 91: Line 111:
st_store(., "numeric_var_name", numeric_matrix_name)
st_store(., ("numeric_var_name"), numeric_matrix_name)
st_store(., ("numeric_var_1", "numeric_var_2", "numeric_var_3"), numeric_matrix_name)
st_sstore(., "string_var_name", string_matrix_name)
mymat=st_data(., ("foo"))
mymat=st_data(., "foo")
Line 97: Line 115:
Note that string data needs to be written using ''st_sstore'', not ''st_store''. To then copy values from a matrix into Stata's active dataset, try:
Line 99: Line 117:
Parameters are specified in the same way as ''st_data''. Note that the function will raise an error if the selected matrix is not '''p-conformable''' ''(i.e. same dimensions)'' with the active session's data, so this flexibility is rarely useful. {{{
st_store(., ("foo", "bar", "baz"), mymat)
}}}

As before, string variables require the `st_sstore` function instead.

This function can raise an error if the selected matrix is not p-conformable (i.e., not the same dimensions) with the active dataset. Generally, it is necessary to clear the active dataset before attempting to copy values back.

Mata

Mata is a matrix programming language that is bundled with Stata.


Data types

Scalars store a single value, either numeric or string. To declare a scalar, try:

a=1
a="A"

Vectors are collections of values. All elements must be of the same data type; numeric and string values cannot be mixed.

To declare a row vector, try any of:

f=(1, 2)
f=(1..100)
f=("A", "B", "C")

To declare a column vector, try any of:

g=(1 \ 2)
g=(1::100)
g=("A" \ "B" \ "C")

Vector elements can be accessed by subscripting, with indexing starting at 1. For example, with g=(1 \ 2), g[1] returns 1.

To declare a matrix, combine these syntaxes.

h=(1, 2 \ 3, 4)
h=("A", "B" \ "C", "D")

Matrix elements can be accessed by subscripting, with the row first and then the column. For example, with h=(1, 2 \ 3, 4), h[2,2] returns 4. Furthermore, an entire row or column of a matrix can be accessed like h[2,] or h[,2] (respectively).
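
As a minimal sketch of these subscripting rules, the following could be entered at the Mata prompt. It uses the numeric examples from this section and writes a missing value (.) in a subscript to stand for "all rows" or "all columns".

{{{
g = (1 \ 2)           // 2 x 1 column vector
h = (1, 2 \ 3, 4)     // 2 x 2 matrix

g[1]                  // first element of g: 1
h[2, 2]               // row 2, column 2: 4
h[2, .]               // all of row 2: (3, 4)
h[., 2]               // all of column 2: (2 \ 4)
}}}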

To horizontally join vectors or matrices, try:

x=(h, g)

To vertically join vectors or matrices, try:

y=(h \ f)
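
Joining requires conformable operands: a horizontal join needs the same number of rows on both sides, and a vertical join needs the same number of columns. A small sketch using the numeric examples from this section, with rows() and cols() used to check the results:

{{{
f = (1, 2)            // 1 x 2 row vector
g = (1 \ 2)           // 2 x 1 column vector
h = (1, 2 \ 3, 4)     // 2 x 2 matrix

x = (h, g)            // 2 x 3, since h and g both have 2 rows
y = (h \ f)           // 3 x 2, since h and f both have 2 columns

(rows(x), cols(x))    // dimensions of x: 2 and 3
(rows(y), cols(y))    // dimensions of y: 3 and 2
}}}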


Operators

Most operators work as expected, e.g. X*b performs matrix multiplication.

New operators include:

 * apostrophe (') for transposition, as in X'
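
For illustration, matrix multiplication and transposition (written with a trailing apostrophe) could be combined as follows. The names X and b are arbitrary, and the values are made up for the example.

{{{
X = (1, 2 \ 3, 4 \ 5, 6)   // 3 x 2 matrix
b = (1 \ 1)                // 2 x 1 column vector

X*b                        // matrix product, 3 x 1: (3 \ 7 \ 11)
X'                         // transpose, 2 x 3: (1, 3, 5 \ 2, 4, 6)
X'*X                       // 2 x 2 cross-product matrix
}}}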


Built-in functions

These functions operate on a matrix:

||Name ||Meaning ||
||det(m) ||determinant ||
||trace(m) ||trace ||
||luinv(m) ||invert a square matrix m ||
||invsym(m) ||invert a symmetric matrix m ||

These functions construct a new matrix:

||Name ||Meaning ||
||I(n) ||return the n x n identity matrix ||
||J(r, c, x) ||return an r x c matrix with all elements set to x ||
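
A brief sketch exercising some of these functions on a small symmetric matrix. The matrix A is invented for the example, and the values in the comments follow from hand calculation.

{{{
A = (2, 1 \ 1, 2)     // symmetric and positive definite

det(A)                // 2*2 - 1*1 = 3
trace(A)              // 2 + 2 = 4
Ainv = invsym(A)      // inverse of A
A*Ainv                // approximately the identity matrix I(2)

I(2)                  // 2 x 2 identity matrix
J(2, 3, 0)            // 2 x 3 matrix of zeros
}}}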


Stata Interoperability

To copy values from Stata's active, in-memory dataset into a matrix, try:

mymat=st_data(., ("foo", "bar", "baz"))

Both arguments to this function are vectors: the first is a column vector selecting rows (observations) of the dataset, and the second is a row vector selecting columns (variables). The first argument is usually given as a missing value (.), which copies all rows. Columns can be identified by either variable name or ordinal index. (Identification by index is technically faster.)

Note that string variables require the st_sdata function instead, which is called in the same way.

Note also that a different syntax can be used to copy a single variable. The following are equivalent:

mymat=st_data(., ("foo"))
mymat=st_data(., "foo")
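
As a concrete sketch, the calls below assume Stata's bundled auto example dataset (loaded with sysuse), in which make is a string variable and price and mpg are the second and third variables. The dataset and the names X, Y, Z, and S are choices made for this example, not part of the page's own examples.

{{{
sysuse auto, clear
mata:
X = st_data(., ("price", "mpg"))    // all observations, columns by name
Y = st_data(., (2, 3))              // the same columns, selected by index
Z = st_data((1::10), "price")       // observations 1 through 10 only
S = st_sdata(., "make")             // string variable needs st_sdata
(rows(X), cols(X))                  // 74 observations, 2 variables
end
}}}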

To then copy values from a matrix into Stata's active dataset, try:

st_store(., ("foo", "bar", "baz"), mymat)

As before, string variables require the st_sstore function instead.

This function raises an error if the matrix is not p-conformable with the selected rows and columns of the active dataset (i.e., if it does not have the same number of rows and columns). Generally, it is necessary to clear the active dataset before attempting to copy values back.
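
One way to meet this requirement is to start from an empty dataset and create exactly as many observations and variables as the matrix needs. The sketch below relies on st_addobs() and st_addvar(), two Mata functions not otherwise covered on this page. The variable names foo and bar and the values in mymat are placeholders.

{{{
clear
mata:
mymat = (1, 2 \ 3, 4 \ 5, 6)                 // 3 x 2 matrix to copy out
st_addobs(rows(mymat))                       // add 3 empty observations
idx = st_addvar("double", ("foo", "bar"))    // create 2 variables, keep their indices
st_store(., idx, mymat)                      // dimensions now match: 3 x 2
end
list
}}}

Selecting the new variables by the indices that st_addvar() returns avoids a second lookup by name.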


CategoryRicottone

Stata/Mata (last edited 2025-10-24 15:29:32 by DominicRicottone)