= Stata Python =

Stata supports calling out to an embedded [[Python]] interpreter.

<<TableOfContents>>

----



== Installation ==

Most system configuration is done with the '''`python set`''' command.

Stata can list recognized Python environments with `python search`. To add an unrecognized environment, try:

{{{
python set exec "C:\path\to\python\installation"
}}}

To prepend or append to the `PYTHONPATH`, use:

{{{
python set userpath "C:\foo" "C:\bar" "C:\baz", [prepend]
}}}

To make these settings persist across Stata sessions, add the '''`permanently`''' option.
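
If the path to hand to `python set exec` is not known, one option is to ask the target interpreter itself. The following is a minimal sketch, meant to be run with the Python installation that Stata should use (outside Stata); `interpreter_path` is only an illustrative name.

```python
import os
import sys

# Absolute path of the interpreter that is running this script.
# This is the kind of value that `python set exec` expects.
interpreter_path = os.path.abspath(sys.executable)
print(interpreter_path)
```

The printed path can then be pasted into the `python set exec` command shown above.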

----



== Usage ==



=== Interactive Prompt ===

Within a Stata interactive session, enter a Python interactive subsession with the `python` command and leave it with `end`.

Stata local macros are referenced with Stata's macro quotes: a backtick before the name and an apostrophe after it. For example:

{{{
. local int_var = 3
. local str_var = "This is a Stata string"
. python
---------------------------------------- python (type end to exit) -----------
>>> `int_var'
3
>>> "`str_var'".split(" ")
['This', 'is', 'a', 'Stata', 'string']
}}}

Within the Python subsession, run a Stata command by prefixing it with `stata:`.

{{{
>>> stata: webuse auto, clear
}}}



=== Scope ===

To interactively run a single Python command and immediately return to the Stata session, use the `python:` command instead. Note the colon (`:`).

{{{
python: print("Hello, world")
}}}



=== Programs ===

Use Python within an [[Stata/AdoFiles|ado file]] with the `python:` command. Much like a [[Stata/Programs|Stata program]] or an interactive Python session, all lines between `python:` and `end` are interpreted by the Python subsession. For example:

{{{
python:

import sqlite3
import pandas as pd

con = sqlite3.connect("example.db")
df = pd.read_sql_query("SELECT * from example", con)
con.close()

end
}}}

Note that objects in the `__main__` namespace are retained across Python sessions. If the `con` [[Python/Sqlite3|sqlite3.Connection]] object had not been closed, the connection would have stayed open, and the object in memory, until the Stata process ended.
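
One way to keep short-lived helpers such as a connection out of `__main__` is to do the work inside a function and close the connection with `contextlib.closing`. The following is a minimal sketch rather than a recommended implementation: `load_rows` is an illustrative name, the `example` table is created in an in-memory database purely for the demonstration, and plain `sqlite3` is used instead of pandas so the block has no third-party dependencies.

```python
import sqlite3
from contextlib import closing


def load_rows(path=":memory:"):
    """Return all rows of a demonstration `example` table as a list of tuples."""
    # closing() guarantees con.close() runs, even if a query raises.
    with closing(sqlite3.connect(path)) as con:
        # Demo setup only: a real database would already hold this table.
        con.execute("CREATE TABLE IF NOT EXISTS example (id INTEGER, name TEXT)")
        con.executemany(
            "INSERT INTO example VALUES (?, ?)",
            [(1, "alpha"), (2, "beta")],
        )
        # fetchall() copies the rows out before the connection is closed.
        return con.execute("SELECT id, name FROM example ORDER BY id").fetchall()


# Only `rows` and `load_rows` end up in the module namespace; `con` does not.
rows = load_rows()
print(rows)
```

Because `con` is local to the function, nothing but the returned rows survives the call.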



=== Interface Module ===

To move data between Python and Stata processes, use the `sfi` module.

{{{
python:

import pandas as pd
from sfi import Data

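# df is assumed to be a pandas DataFrame with "id" and "name" columns,
# for example one retained in __main__ from an earlier python block

# initialize N cases
Data.setObsTotal(len(df))

# initialize variables
Data.addVarDouble("id")
Data.addVarStr("name",5)

# copy columnar data into the variables created above
Data.store("id", None, df["id"], None)
Data.store("name", None, df["name"], None)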

# free memory
del df

end
}}}

This module can be imported into both programs and interactive sessions. It ships with Stata and is only importable from Stata's embedded interpreter; it is not distributed as a public package.
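
The same steps (set the observation count, create each variable, store each column) can be wrapped in a small helper. The following is a sketch under stated assumptions: `copy_columns` and `RecordingData` are hypothetical names, not part of `sfi`. The helper takes the data interface as an argument so that it can be exercised outside Stata; inside Stata, `sfi.Data` would be passed instead of the recording stand-in. It relies only on the four `Data` calls used above.

```python
def copy_columns(data_api, columns):
    """Create one Stata variable per column and fill it.

    data_api: object exposing setObsTotal, addVarDouble, addVarStr and store
              (inside Stata this would be sfi.Data)
    columns:  dict mapping variable names to equal-length lists of values
    """
    lengths = {len(values) for values in columns.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same length")
    data_api.setObsTotal(lengths.pop())

    for name, values in columns.items():
        if all(isinstance(v, str) for v in values):
            # Size the string variable to the longest value (at least 1).
            width = max((len(v) for v in values), default=1)
            data_api.addVarStr(name, max(width, 1))
        else:
            data_api.addVarDouble(name)
        data_api.store(name, None, values, None)


class RecordingData:
    """Hypothetical stand-in for sfi.Data that only records the calls made."""

    def __init__(self):
        self.calls = []

    def setObsTotal(self, n):
        self.calls.append(("setObsTotal", n))

    def addVarDouble(self, name):
        self.calls.append(("addVarDouble", name))

    def addVarStr(self, name, width):
        self.calls.append(("addVarStr", name, width))

    def store(self, var, obs, val, selectvar=None):
        self.calls.append(("store", var, list(val)))


recorder = RecordingData()
copy_columns(recorder, {"id": [1, 2, 3], "name": ["ann", "bob", "carol"]})
for call in recorder.calls:
    print(call)
```

Computing the string width from the data avoids hard-coding a length such as `5` and silently truncating longer values.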



=== Mixing Python and Stata Programs ===

When designing a generalized Python program for use from within Stata, the predominant design pattern is:

 * create an ado file and Stata program that handles all interfacing with the end user
 * define a minimal Python function
   * this function will live in the ado file's namespace, not `__main__`, so it won't pollute an end user's session
 * call the Python function from the Stata program with a single-line `python:` call

{{{
program varsum
    version 16.0
    syntax varname [if] [in]
    marksample touse

    python: _varsum("`varlist'", "`touse'")
    display as txt " sum of `varlist': " as res r(sum)
end

python:
from sfi import Data, Scalar
def _varsum(varname, touse):
    x = Data.get(varname, None, touse)
    Scalar.setValue("r(sum)", sum(x))
end
}}}

{{{
. webuse auto
(1978 Automobile Data)

. varsum price
sum of price: 456229

. varsum price if foreign
sum of price: 140463

}}}

----



== See also ==

[[https://www.stata.com/manuals/ppystataintegration.pdf|Stata manual on PyStata integration]]



----
CategoryRicottone