Commit 76e936c

Python, doc: Add links to runs on LGTM.com
1 parent 91c0066 commit 76e936c

1 file changed

Lines changed: 44 additions & 16 deletions

docs/codeql/codeql-language-guides/analyzing-data-flow-in-python.rst

@@ -89,6 +89,8 @@ Python has builtin functionality for reading and writing files, such as the func
 call = API::moduleImport("os").getMember("open").getACall()
 select call.getArg(0)
 
+➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8635258505893505141/>`__. Two of the demo projects make use of this low-level API.
+
 Unfortunately this will only give the expression in the argument, not the values which could be passed to it. So we use local data flow to find all expressions that flow into the argument:
 
 .. code-block:: ql
@@ -99,11 +101,32 @@ Unfortunately this will only give the expression in the argument, not the values
 
 from DataFlow::CallCfgNode call, DataFlow::ExprNode expr
 where
-call = API::moduleImport("os").getMember("open").getACall()
-and DataFlow::localFlow(expr, call.getArg(0))
-select expr
+call = API::moduleImport("os").getMember("open").getACall() and
+DataFlow::localFlow(expr, call.getArg(0))
+select call, expr
+
+➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8213643003890447109/>`__. Many expressions flow to the same call.
 
-Then we can make the source more specific, for example a parameter to a function or method. This query finds instances where a parameter is used as the name when opening a file:
+We see that we get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``:
+
+.. code-block:: ql
+
+import python
+import semmle.python.dataflow.new.DataFlow
+import semmle.python.ApiGraphs
+
+from DataFlow::CallCfgNode call, DataFlow::ExprNode expr
+where
+call = API::moduleImport("os").getMember("open").getACall() and
+DataFlow::localFlow(expr, call.getArg(0)) and
+expr instanceof DataFlow::LocalSourceNode
+select call, expr
+
+➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2017139821928498055/>`__. We now mostly have one expression per call.
+
+We still have some cases of more than one expression flowing to a call, but then they flow through different code paths (possibly due to control-flow splitting, as in the second case).
+
+We can also make the source more specific, for example a parameter to a function or method. This query finds instances where a parameter is used as the name when opening a file:
 
 .. code-block:: ql
 
@@ -113,13 +136,13 @@ Then we can make the source more specific, for example a parameter to a function
 
 from DataFlow::CallCfgNode call, DataFlow::ParameterNode p
 where
-call = API::moduleImport("os").getMember("open").getACall()
-and DataFlow::localFlow(p, call.getArg(0))
-select p, "Opening a file based on a parameter."
+call = API::moduleImport("os").getMember("open").getACall() and
+DataFlow::localFlow(p, call.getArg(0))
+select call, p
 
-Using the exact name in the parameter may be too strict. If we want to know if the parameter influences
-the file name, we can use taint tracking instead of data flow.
-This query finds calls to ``os.open`` where the filename is derived from a parameter:
+➤ `See this in the query console on LGTM.com <https://lgtm.com/query/3998032643497238063/>`__. Very few hits now; these could feasibly be inspected manually.
+
+Using the exact name supplied via the parameter may be too strict. If we want to know if the parameter influences the file name, we can use taint tracking instead of data flow. This query finds calls to ``os.open`` where the filename is derived from a parameter:
 
 .. code-block:: ql
 
@@ -129,9 +152,11 @@ This query finds calls to ``os.open`` where the filename is derived from a param
 
 from DataFlow::CallCfgNode call, DataFlow::ParameterNode p
 where
-call = API::moduleImport("os").getMember("open").getACall()
-and TaintTracking::localTaint(p, call.getArg(0))
-select p, "Opening a file based on a parameter."
+call = API::moduleImport("os").getMember("open").getACall() and
+TaintTracking::localTaint(p, call.getArg(0))
+select call, p
+
+➤ `See this in the query console on LGTM.com <https://lgtm.com/query/2129957933670836953/>`__. Now we get more hits and in more projects.
 
 Global data flow
 ----------------
@@ -261,7 +286,7 @@ This data flow configuration tracks data flow from environment variables to open
 import semmle.python.ApiGraphs
 
 class EnvironmentToFileConfiguration extends DataFlow::Configuration {
-EnvironmentToFileConfiguration() { this = "Environment opening files" }
+EnvironmentToFileConfiguration() { this = "EnvironmentToFileConfiguration" }
 
 override predicate isSource(DataFlow::Node source) {
 source = API::moduleImport("os").getMember("getenv").getACall()
@@ -277,8 +302,11 @@ This data flow configuration tracks data flow from environment variables to open
 
 from Expr environment, Expr fileOpen, EnvironmentToFileConfiguration config
 where config.hasFlow(DataFlow::exprNode(environment), DataFlow::exprNode(fileOpen))
-select fileOpen, "This 'File.Open' uses data from $@.",
-environment, "call to 'GetEnvironmentVariable'"
+select fileOpen, "This call to 'os.open' uses data from $@.",
+environment, "call to 'os.getenv'"
+
+➤ `Running this in the query console on LGTM.com <https://lgtm.com/query/6582374907796191895/>`__ unsurprisingly yields no results in the demo projects.
+
 
 Further reading
 ---------------
