Differences between revisions 13 and 14
Revision 13 as of 2023-09-26 18:26:25
Size: 3711
Comment:
Revision 14 as of 2025-10-24 16:20:01
Size: 3359
Comment: Rewrite
----


Stata Python

Stata supports calling out to an embedded Python interpreter.


Installation

To list recognized Python environments, try python search. If a valid environment is known to exist but was not recognized, try:

python set exec "C:\path\to\python\installation"

To add directories to the PYTHONPATH environment variable, use python set userpath. Paths are appended by default; add the prepend option to place them first:

python set userpath "C:\foo" "C:\bar" "C:\baz"
python set userpath "C:\foo" "C:\bar" "C:\baz", prepend

To make these settings permanent, add the permanently option.
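
One way to confirm that the configured directories took effect is to compare them against the module search path from inside the Python scope. The sketch below assumes that directories registered with python set userpath show up in sys.path; the helper name missing_paths is hypothetical and not part of Stata or Python.

```python
import os
import sys


def missing_paths(expected, search_path=None):
    """Return the entries of `expected` that are absent from the search path."""
    if search_path is None:
        search_path = sys.path  # the interpreter's module search path

    def norm(path):
        # Normalize separators, trailing slashes, and (on Windows) letter case
        return os.path.normcase(os.path.normpath(path))

    present = {norm(p) for p in search_path if p}
    return [p for p in expected if norm(p) not in present]
```

Called as missing_paths([r"C:\foo", r"C:\bar", r"C:\baz"]) from the Python scope, an empty list would suggest that every directory is visible to the interpreter.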


Usage

Much like Mata, there is a Python scope.

python: print("Hello, world")

Subsessions

Within an interactive Stata session, enter a Python interactive subsession with the -python- command.

. // This is an interactive Stata session
. local int_var = 3

. local str_var = "This is a Stata string"

.
. // Now start the Python subsession
. python
---------------------------------------- python (type end to exit) -----------
>>>
>>> myint = 3
>>> mystr = "This is a Python string"
>>>
>>> # Stata macros evaluate here
>>> `int_var'
3
>>> "`str_var'".split(" ")
['This', 'is', 'a', 'Stata', 'string']
>>>
>>> # Call back out to the Stata session
>>> stata: webuse auto, clear
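
The quoting in the session above matters because Stata substitutes macros as plain text before Python sees the line. The following is only an illustrative model of that substitution written in ordinary Python, not Stata's actual implementation; the function expand_macros and the macros dictionary are hypothetical.

```python
import re

# Matches a Stata local macro reference such as `int_var'
MACRO = re.compile(r"`(\w+)'")


def expand_macros(line, macros):
    """Replace each `name' with its text; undefined macros become empty."""
    return MACRO.sub(lambda m: str(macros.get(m.group(1), "")), line)


macros = {"int_var": "3", "str_var": "This is a Stata string"}

# A numeric macro expands to a valid Python literal
print(expand_macros("`int_var'", macros))

# A string macro needs surrounding quotes to become a Python string
print(expand_macros('''"`str_var'".split(" ")''', macros))

# Without the quotes, the expansion is bare words that Python cannot parse
print(expand_macros("`str_var'", macros))
```

In this model the second line expands to a valid expression, while the third expands to unquoted text, which is why string macros are wrapped in quotation marks.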

Programs

Within an ado file, run Python code like:

python:
import sqlite3
import pandas as pd

con = sqlite3.connect("`a'")
df = pd.read_sql_query("SELECT * from `b'", con)
con.close()
end

Note that objects in the __main__ namespace are retained across Python sessions. If the con connection (an sqlite3.Connection object) had not been closed, it would have stayed open until the Stata process ended.
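
One way to guard against a forgotten close is to scope the connection inside a function and wrap it in contextlib.closing. This is a minimal sketch that uses plain sqlite3 rather than pandas to stay within the standard library; fetch_rows is a hypothetical helper name.

```python
import sqlite3
from contextlib import closing


def fetch_rows(db_path, query):
    """Run `query` against the database at `db_path` and return all rows.

    The connection is closed when the with block exits, even on error,
    and `con` is local to the function, so nothing is left in __main__.
    """
    with closing(sqlite3.connect(db_path)) as con:
        return con.execute(query).fetchall()
```

Within an ado file this could be called with the same macros as above, for example python: rows = fetch_rows("`a'", "SELECT * from `b'").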

Best practices for bundling Python in an ado file are:

  • Write as a function to be called, not as a script that evaluates immediately
  • Write an accompanying Stata program to wrap it, and handle everything user-facing there
  • Call the function from a scope

program varsum
    version 16.0
    syntax varname [if] [in]
    marksample touse

    python: _varsum("`varlist'", "`touse'")
    display as txt " sum of `varlist': " as res r(sum)
end

python:
from sfi import Data, Scalar
def _varsum(varname, touse):
    x = Data.get(varname, None, touse)
    Scalar.setValue("r(sum)", sum(x))
end

. webuse auto
(1978 Automobile Data)

. varsum price
sum of price: 456229

. varsum price if foreign
sum of price: 140463


Stata Interoperability

To move data between Python and Stata processes, use the sfi module.

python:

import pandas as pd
from sfi import Data

# example data; in practice df is the DataFrame to copy into Stata
df = pd.DataFrame({"id": [1, 2], "name": ["Alice", "Bob"]})

# initialize N cases
Data.setObsTotal(len(df))

# initialize variables
Data.addVarDouble("id")
Data.addVarStr("name",5)

# copy columnar data
Data.store("id", None, df["id"], None)
Data.store("name", None, df["name"], None)

# free memory
del df

end

This module can be imported into both programs and interactive sessions. It is not a publicly available module.
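
Because sfi is only importable inside Stata, it can help to guard the import so that supporting helpers can still be exercised elsewhere. The sketch below makes that assumption and uses only the Data calls shown above; str_width and store_columns are hypothetical helper names, and the string width is computed from the data rather than hard-coded.

```python
try:
    from sfi import Data  # only importable inside Stata's embedded Python
except ImportError:
    Data = None  # allows the pure-Python helper to run outside Stata


def str_width(values):
    """Return the longest string length in `values`, with a minimum of 1."""
    return max((len(str(v)) for v in values), default=1) or 1


def store_columns(columns):
    """Copy a dict of equal-length columns into an empty Stata dataset."""
    if Data is None:
        raise RuntimeError("sfi is only available inside Stata")
    lengths = {len(values) for values in columns.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same length")
    Data.setObsTotal(lengths.pop())
    for name, values in columns.items():
        values = list(values)
        if all(isinstance(v, str) for v in values):
            # size the string variable to fit the widest value
            Data.addVarStr(name, str_width(values))
        else:
            Data.addVarDouble(name)
        Data.store(name, None, values, None)
```

Inside Stata, store_columns({"id": [1, 2], "name": ["Alice", "Bob"]}) would be expected to create the same two variables as the example above.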


See also

Stata manual on PyStata integration


CategoryRicottone

Stata/Python (last edited 2025-10-24 16:20:01 by DominicRicottone)