Differences between revisions 6 and 14 (spanning 8 versions)
Revision 6 as of 2022-09-24 20:20:46
Size: 2480
Comment:
Revision 14 as of 2025-10-24 16:20:01
Size: 3359
Comment: Rewrite
Line 1: Line 1:
## page was renamed from StataPython
Line 3: Line 2:

Stata supports calling out to an embedded [[Python]] interpreter.
Line 12: Line 13:
The path to the Python executable is set using: To list recognized Python environments, try `python search`. If a valid environment is known to exist but was not recognized, try:
Line 15: Line 16:
python set exec "path_string" python set exec "C:\path\to\python\installation"
Line 18: Line 19:
Stata can list recognized Python environments with `python search`.

To prepend or append to the `PYTHONPATH`, use:
To manipulate (i.e., append or prepend) the `PYTHONPATH` environment variable, try:
Line 23: Line 22:
python set userpath "path_string" [...], [prepend] python set userpath "C:\foo" "C:\bar" "C:\baz"
python set userpath "C:\foo" "C:\bar" "C:\baz", prepend
Line 26: Line 26:
To make these settings permanent, use the `permanent` option. To make these settings permanent, add the '''`permanent`''' option.
Line 34: Line 34:
=== Python REPL ===

A Python REPL is entered with the `python` command. It prints to the screen a reminder that the `end` command is used to exit the environment.

Stata local variables are accessed with quotations.
Much like [[Stata/Mata|Mata]], there is a '''Python scope'''.
Line 41: Line 37:
python: print("Hello, world")
}}}



=== Subsessions ===

Within an interactive Stata session, enter a Python interactive subsession with the '''`-python-`''' command.

{{{
. // This is an interactive Stata session
Line 42: Line 49:
Line 43: Line 51:

.
. // Now start the Python subsession
Line 45: Line 56:
>>>
>>> myint = 3
>>> mystr = "This is a Python string"
>>>
>>> # Stata macros evaluate here
Line 49: Line 65:
}}}

Within the REPL, the `stata` context submits commands to the parent Stata shell.

{{{
>>> stata: sysuse auto, clear
>>>
>>> # Call back out to the Stata session
>>> stata: webuse auto, clear
Line 59: Line 72:
=== Python Scope === === Programs ===
Line 61: Line 74:
A Python scope can be entered with `python:` (similar to defining a function in Stata). This Python environment is exited upon leaving the scope, i.e., at the end of the statement. To submit multiple statements, delimit them with semicolons.



=== Python SFI Module ===

A foreign function interface module is also available.
Within an [[Stata/AdoFiles|ado file]], run Python code like:
Line 70: Line 77:
from sfi import Data
pymake = Data.get('make')
# do something
}}}
python:
import sqlite3
import pandas as pd
Line 75: Line 81:
This can even be used within the Python REPL that is running within the Stata shell.



=== Stata Do Files ===

Python scopes within a Do file are entered with `python:` and terminated with `end`.

{{{
sysuse auto, clear
python:
from sfi import Data
pymake = Data.get('make')
# do something
con = sqlite3.connect("`a'")
df = pd.read_sql_query("SELECT * from `b'", con)
con.close()
Line 92: Line 87:
A common strategy is to define Python functions within a Do file, then create Stata functions to interface. Note that objects in the `__main__` namespace are retained across Python sessions. If the `con` [[Python/Sqlite3|sqlite3.Connection]] object was not closed, it would have remained in memory until the Stata process ended.

Best practices for bundling Python in an ado file are:
 * Write as a function to be called, not as a script that evaluates immediately
 * Write an accompanying Stata program to wrap it, and handle everything user-facing there
 * Call the function from a scope
Line 112: Line 112:
. sysuse auto, clear . webuse auto
Line 123: Line 123:
----



== Stata Interoperability ==

To move data between Python and Stata processes, use the `sfi` module.

{{{
python:

import pandas as pd
from sfi import Data

# df is assumed to be a pandas DataFrame already defined in this session

# initialize N cases
Data.setObsTotal(len(df))

# initialize variables
Data.addVarDouble("id")
Data.addVarStr("name", 5)

# copy columnar data
Data.store("id", None, df["id"], None)
Data.store("name", None, df["name"], None)

# free memory
del df

end
}}}

This module can be imported into both programs and interactive sessions. It is bundled with Stata and is not distributed as a standalone package.

----



== See also ==

[[https://www.stata.com/manuals/ppystataintegration.pdf|Stata manual on PyStata integration]]

Stata Python

Stata supports calling out to an embedded Python interpreter.


Installation

To list recognized Python environments, try python search. If a valid environment is known to exist but was not recognized, try:

python set exec "C:\path\to\python\installation"

To manipulate (i.e., append or prepend) the PYTHONPATH environment variable, try:

python set userpath "C:\foo" "C:\bar" "C:\baz"
python set userpath "C:\foo" "C:\bar" "C:\baz", prepend

To make these settings permanent, add the permanent option.
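
To check that these settings took effect, one option is to inspect the sys module from a Python scope. The following is a minimal sketch using only the standard library. The helper name describe_interpreter is hypothetical, and it assumes that directories added with python set userpath show up in sys.path.

```python
import sys


def describe_interpreter():
    """Return the version and module search path of the running interpreter."""
    return {
        "version": sys.version.split()[0],  # e.g. "3.11.4"
        "search_path": list(sys.path),      # copy, so callers cannot mutate sys.path
    }


info = describe_interpreter()
print("Python", info["version"])
for entry in info["search_path"]:
    print("  ", entry)
```

If an expected directory is missing from the printed list, rerunning python set userpath (optionally with prepend) and restarting Stata may be worth trying.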


Usage

Much like Mata, there is a Python scope.

python: print("Hello, world")

Subsessions

Within an interactive Stata session, enter a Python interactive subsession with the -python- command.

. // This is an interactive Stata session
. local int_var = 3

. local str_var = "This is a Stata string"

.
. // Now start the Python subsession
. python
---------------------------------------- python (type end to exit) -----------
>>>
>>> myint = 3
>>> mystr = "This is a Python string"
>>>
>>> # Stata macros evaluate here
>>> `int_var'
3
>>> "`str_var'".split(" ")
['This', 'is', 'a', 'Stata', 'string']
>>>
>>> # Call back out to the Stata session
>>> stata: webuse auto, clear

Programs

Within an ado file, run Python code like:

python:
import sqlite3
import pandas as pd

con = sqlite3.connect("`a'")
df = pd.read_sql_query("SELECT * from `b'", con)
con.close()
end

Note that objects in the __main__ namespace are retained across Python sessions. If the con sqlite3.Connection object was not closed, it would have remained in memory until the Stata process ended.
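
One way to avoid leaving open connections behind is to keep the connection local to a function and let a context manager close it. The sketch below is a standard-library illustration of that idea, not the page author's method. The names read_rows and make_demo_db and the table t are hypothetical, and a throwaway database file stands in for the path that would normally come from a Stata macro.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def read_rows(db_path, query):
    """Run a query and return all rows; the connection is closed on exit."""
    # closing() calls con.close() even if the query raises.
    with closing(sqlite3.connect(db_path)) as con:
        return con.execute(query).fetchall()


def make_demo_db(db_path):
    """Create a small example table (hypothetical data for the demo)."""
    with closing(sqlite3.connect(db_path)) as con:
        con.execute("CREATE TABLE t (id INTEGER, name TEXT)")
        con.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
        con.commit()


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    make_demo_db(demo_path)
    rows = read_rows(demo_path, "SELECT id, name FROM t ORDER BY id")

print(rows)
```

Because the connection only ever exists inside the functions, nothing but the returned rows stays in the __main__ namespace after the call.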

Best practices for bundling Python in an ado file are:

  • Write as a function to be called, not as a script that evaluates immediately
  • Write an accompanying Stata program to wrap it, and handle everything user-facing there
  • Call the function from a scope

program varsum
    version 16.0
    syntax varname [if] [in]
    marksample touse
    python: _varsum("`varlist'", "`touse'")
    display as txt " sum of `varlist': " as res r(sum)
end

python:
from sfi import Data, Scalar
def _varsum(varname, touse):
    x = Data.get(varname, None, touse)
    Scalar.setValue("r(sum)", sum(x))
end

. webuse auto
(1978 Automobile Data)

. varsum price
sum of price: 456229

. varsum price if foreign
sum of price: 140463


Stata Interoperability

To move data between Python and Stata processes, use the sfi module.

python:

import pandas as pd
from sfi import Data

# df is assumed to be a pandas DataFrame already defined in this session

# initialize N cases
Data.setObsTotal(len(df))

# initialize variables
Data.addVarDouble("id")
Data.addVarStr("name", 5)

# copy columnar data
Data.store("id", None, df["id"], None)
Data.store("name", None, df["name"], None)

# free memory
del df

end

This module can be imported into both programs and interactive sessions. It is bundled with Stata and is not distributed as a standalone package.
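
The string width passed to Data.addVarStr has to be chosen before the data are stored. The sketch below shows one possible way to derive it from the data rather than hard-coding it. The helpers plan_variables and push_to_stata are hypothetical names, plain lists stand in for DataFrame columns, and the width is assumed to be the longest UTF-8 byte length. The sfi import sits inside push_to_stata because the module is only importable within Stata, and that function assumes an empty dataset in memory.

```python
def plan_variables(columns):
    """Pick a storage type for each column.

    columns maps variable names to equal-length lists of values.
    Returns (name, kind, width) tuples; kind is "str" or "double",
    and width is the longest UTF-8 byte length (0 for doubles).
    """
    plan = []
    for name, values in columns.items():
        if all(isinstance(v, str) for v in values):
            width = max((len(v.encode("utf-8")) for v in values), default=1)
            plan.append((name, "str", max(width, 1)))  # width of at least 1
        else:
            plan.append((name, "double", 0))
    return plan


def push_to_stata(columns):
    """Create and fill Stata variables; runs only inside Stata's Python."""
    from sfi import Data  # bundled with Stata, unavailable elsewhere

    nobs = len(next(iter(columns.values()), []))
    Data.setObsTotal(nobs)
    for name, kind, width in plan_variables(columns):
        if kind == "str":
            Data.addVarStr(name, width)
        else:
            Data.addVarDouble(name)
        Data.store(name, None, columns[name], None)


columns = {"id": [1.0, 2.0, 3.0], "name": ["Ann", "Bob", "Chloé"]}
print(plan_variables(columns))
```

Only plan_variables runs outside Stata; within a python: scope, push_to_stata(columns) would then perform the same setObsTotal, addVar, and store steps shown above.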


See also

Stata manual on PyStata integration



Stata/Python (last edited 2025-10-24 16:20:01 by DominicRicottone)