Differences between revisions 6 and 13 (spanning 7 versions)
Revision 6 as of 2022-09-24 20:20:46
Size: 2480
Comment:
Revision 13 as of 2023-09-26 18:26:25
Size: 3711
Comment:
Stata Python

Stata supports calling out to an embedded Python interpreter.


Installation

Most system configuration is done with the python set command.

Stata can list recognized Python environments with python search. To use an installation that is not listed, give the path to its Python executable:

python set exec "C:\path\to\python\installation\python.exe"

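To add directories to Python's module search path (PYTHONPATH), use python set userpath. Paths are appended by default; add the prepend option to place them first:

python set userpath "C:\foo" "C:\bar" "C:\baz", prepend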

To make these settings permanent, add the permanent option.
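
To confirm which interpreter and search paths a session ended up with, the standard sys module can be inspected from the Python side. This is a minimal sketch: the helper name describe_interpreter is hypothetical, and it assumes that paths added with python set userpath show up in sys.path, which is worth verifying on your own installation.

```python
import sys

def describe_interpreter():
    """Collect the interpreter details that `python set` is meant to control."""
    return {
        "version": sys.version.split()[0],  # version string of the running interpreter
        "executable": sys.executable,       # may be empty or name the host program when embedded
        "search_path": list(sys.path),      # module search path, in lookup order
    }

info = describe_interpreter()
print("Python", info["version"])
for entry in info["search_path"]:
    print("  ", entry)
```

Inside Stata, these lines could be run between python: and end. The python query command reports the current settings from the Stata side.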


Usage

Interactive Prompt

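Within a Stata interactive session, enter a Python interactive subsession with the python command. Stata prints a reminder that end exits the subsession.

Stata local macros can be referenced with Stata's macro quotes (`name'); they are expanded before the line reaches Python. For example: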

. local int_var = 3
. local str_var = "This is a Stata string"
. python
---------------------------------------- python (type end to exit) -----------
>>> `int_var'
3
>>> "`str_var'".split(" ")
['This', 'is', 'a', 'Stata', 'string']

To run a Stata command from inside the subsession, prefix it with stata:.

>>> stata: webuse auto, clear

Scope

To interactively run a single Python command and immediately return to the Stata session, use the python: command instead. Note the colon (:).

python: print("Hello, world")

Programs

Use Python within an ado file with the python: command. Much like a Stata program or an interactive Python session, all lines between python: and end will be interpreted by the Python subsession. For example:

python:

import sqlite3
import pandas as pd

con = sqlite3.connect("example.db")
df = pd.read_sql_query("SELECT * from example", con)
con.close()

end

Note that objects in the __main__ namespace are retained across Python sessions. If the con sqlite3.Connection object had not been closed, the connection would have stayed open until the Stata process ended. The names con and df remain defined either way.
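
One way to avoid leaving a connection behind is to keep it local to a function and scope it with contextlib.closing, so it is closed even if a query raises. The sketch below is an illustration only: the read_example helper, the in-memory database, and the table contents are made up so that it runs anywhere.

```python
import sqlite3
from contextlib import closing

def read_example():
    """Build a throwaway table, query it, and close the connection on exit."""
    # closing() calls con.close() when the block ends, even on an exception.
    # con is local to this function, so it never lands in __main__.
    with closing(sqlite3.connect(":memory:")) as con:
        con.execute("CREATE TABLE example (id INTEGER, name TEXT)")
        con.executemany(
            "INSERT INTO example VALUES (?, ?)",
            [(1, "ann"), (2, "bob")],  # hypothetical rows
        )
        return con.execute("SELECT id, name FROM example ORDER BY id").fetchall()

rows = read_example()
print(rows)
```

Only rows is left in the namespace afterwards; the connection object goes out of scope when the function returns.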

Interface Module

To move data between Python and Stata, use the sfi module.

python:

import pandas as pd
from sfi import Data

# df is the DataFrame read in the previous example; it is still in __main__

# set the number of observations
Data.setObsTotal(len(df))

# create the variables
Data.addVarDouble("id")
Data.addVarStr("name",5)

# copy columnar data
Data.store("id", None, df["id"], None)
Data.store("name", None, df["name"], None)

# free memory
del df

end

This module can be imported into both programs and interactive sessions. It is bundled with Stata and is not a publicly available module.
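
The string width passed to Data.addVarStr is hard-coded in the example above. A small pure-Python helper could work out variable types and widths first, which also makes that step testable outside Stata. This is a sketch under assumptions: plan_variables and the sample columns are hypothetical, and numeric columns are assumed to map to Data.addVarDouble and everything else to Data.addVarStr.

```python
def plan_variables(columns):
    """Map column name -> ("double", None) or ("str", width) for a dict of lists."""
    plan = {}
    for name, values in columns.items():
        if all(isinstance(v, (int, float)) for v in values):
            plan[name] = ("double", None)
        else:
            # Stata str widths count bytes, so measure the UTF-8 encoding.
            width = max((len(str(v).encode("utf-8")) for v in values), default=1)
            plan[name] = ("str", max(width, 1))
    return plan

columns = {"id": [1, 2, 3], "name": ["ann", "bob", "carol"]}  # hypothetical data
plan = plan_variables(columns)
print(plan)

# Inside Stata, the plan could drive the sfi calls shown earlier, e.g.:
#   Data.addVarDouble("id")
#   Data.addVarStr("name", 5)
```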

Mixing Python and Stata Programs

When designing a generalized Python program for use from within Stata, the predominant design pattern is:

  • create an ado file containing a Stata program that handles all interfacing with the end user
  • define a minimal Python function in the same ado file
    • this function lives in the ado file's namespace, not __main__, so it won't pollute an end user's session
  • call the Python function from the Stata program with a single-line python: call

program varsum
    version 16.0
    syntax varname [if] [in]
    // create the temporary 0/1 variable that marks the [if] [in] sample
    marksample touse

    python: _varsum("`varlist'", "`touse'")
    display as txt " sum of `varlist': " as res r(sum)
end

python:
from sfi import Data, Scalar
def _varsum(varname, touse):
    x = Data.get(varname, None, touse)
    Scalar.setValue("r(sum)", sum(x))
end

. webuse auto
(1978 Automobile Data)

. varsum price
sum of price: 456229

. varsum price if foreign
sum of price: 140463
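
Because sfi can only be imported inside Stata, one way to keep such a function testable is to separate the arithmetic from the sfi calls. A sketch of that split follows: column_sum and the ImportError guard are additions for illustration, while _varsum mirrors the function above.

```python
try:
    from sfi import Data, Scalar  # available only inside Stata
except ImportError:
    Data = Scalar = None          # lets the file be imported elsewhere

def column_sum(values):
    """Pure computation with no Stata dependency, so it can be unit-tested."""
    return float(sum(values))

def _varsum(varname, touse):
    """Thin glue: read the marked observations and store the result in r(sum)."""
    if Data is None:
        raise RuntimeError("_varsum must be called from within Stata")
    x = Data.get(varname, None, touse)
    Scalar.setValue("r(sum)", column_sum(x))

print(column_sum([1, 2, 3.5]))
```

Outside Stata only column_sum is usable; calling _varsum there raises a RuntimeError rather than failing on a missing import.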


See also

Stata manual on PyStata integration: https://www.stata.com/manuals/ppystataintegration.pdf



Stata/Python (last edited 2025-10-24 16:20:01 by DominicRicottone)