Analyzer
ScanPyImports.analyzer
A module for processing and analyzing data on imported modules.
Classes:
- Data – Construct the base DataFrame of the import statement data.
- DataAnalyzer – Process and analyze the data on imported modules.
Classes
Data(path)
Construct the base DataFrame of the import statement data.
Parameters:
- path (str) – Path to the directory.
Source code in ScanPyImports/analyzer.py
Attributes
path: str (instance attribute)
Path to the directory.
df: Optional[pd.DataFrame] (property)
DataFrame containing import data or None if the directory does not exist.
DataFrame content
The DataFrame contains the following columns:
- imported_0, imported_1, ... representing the imported packages and modules, where
  - imported_0 represents the top-level package or library being imported,
  - imported_1 represents the module or submodule within the package being imported,
  - imported_2 and so on represent further nested modules or submodules, if present in the import statement.
- original: The original line of text containing the import statement.
- alias: Alias (if any) of the submodule.
- path: Full path of the file containing the import.
- file: File name.
- filename: File name without extension.
- extension: File extension.
- directory: Directory path of the file.
The data is created in a private method of this class. One could modify that code to return a dictionary or a JSON file instead of a DataFrame.
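To make the column layout concrete, here is a minimal sketch of how a single import line could map onto one row. It is not the library's private method: `import_row` is a hypothetical helper, it only handles the plain `import a.b.c [as x]` form, and whether the real `extension` column keeps the leading dot is an assumption. It returns a dictionary, in the spirit of the remark above about dictionary or JSON output.

```python
import json
from pathlib import PurePosixPath


def import_row(line: str, file_path: str) -> dict:
    """Build one illustrative row from a simple `import a.b.c [as x]` line.

    Hypothetical helper for illustration only; it does not parse
    `from ... import ...` or multi-module import statements.
    """
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "import":
        raise ValueError(f"unsupported statement: {line!r}")

    # Optional alias, as in `import a.b as x`.
    alias = tokens[3] if len(tokens) == 4 and tokens[2] == "as" else None

    row = {}
    # imported_0 is the top-level package, imported_1 the next level, and so on.
    for level, name in enumerate(tokens[1].split(".")):
        row[f"imported_{level}"] = name

    p = PurePosixPath(file_path)
    row.update(
        original=line,
        alias=alias,
        path=str(p),
        file=p.name,
        filename=p.stem,
        extension=p.suffix,  # assumption: leading dot kept, e.g. ".py"
        directory=str(p.parent),
    )
    return row


row = import_row("import matplotlib.pyplot as plt", "project/src/plot.py")
print(json.dumps(row, indent=2))
```

A list of such dictionaries can be serialized with `json.dumps`, or passed to `pd.DataFrame` to obtain a table with one column per key.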
DataAnalyzer(path, to_exclude=None)
Bases: Data
A class to process the data on imported modules.
Parameters:
- path (str) – Path to the directory.
- to_exclude (List[str], default: None) – List of package names to exclude from the analysis.
Methods:
- get_frequencies – Return the frequency of imported modules.
Attributes
path: str (instance attribute)
Path to the directory.
df: Optional[pd.DataFrame] (property)
DataFrame containing import data or None if the directory does not exist.
DataFrame content
The DataFrame contains the following columns:
- imported_0, imported_1, ... representing the imported packages and modules, where
  - imported_0 represents the top-level package or library being imported,
  - imported_1 represents the module or submodule within the package being imported,
  - imported_2 and so on represent further nested modules or submodules, if present in the import statement.
- original: The original line of text containing the import statement.
- alias: Alias (if any) of the submodule.
- path: Full path of the file containing the import.
- file: File name.
- filename: File name without extension.
- extension: File extension.
- directory: Directory path of the file.
The data is created in a private method inherited from Data. One could modify that code to return a dictionary or a JSON file instead of a DataFrame.
to_exclude: List[str] (instance attribute)
List of package names to exclude from the analysis.
own_processed_df: pd.DataFrame (property)
A copy of the DataFrame (df) after processing own-created modules.
Own-created modules
Own-created modules are defined as Python scripts that are imported as modules and reside in the same folder as the script containing the import statement.
A natural extension would be to also include own-created packages residing in the same folder as the .py or .ipynb file where the import statement resides.
In the returned DataFrame, own-created modules are dropped and replaced by the import statements residing inside the own-created module script, provided they relate to external libraries.
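The replacement rule described above can be sketched with the standard library alone. This is not the code behind `own_processed_df`. The helpers `simple_imports` and `expand_own_modules` are hypothetical, they only read plain `import x` lines, and they work on lists of top-level names rather than on a DataFrame.

```python
import tempfile
from pathlib import Path
from typing import List


def simple_imports(file: Path) -> List[str]:
    """Top-level names from plain `import x[.y]` lines (illustrative parser)."""
    names = []
    for line in file.read_text().splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] == "import":
            names.append(tokens[1].split(".")[0])
    return names


def is_own_module(name: str, importing_file: Path) -> bool:
    """True if `<name>.py` sits in the same folder as the importing file."""
    return (importing_file.parent / f"{name}.py").is_file()


def expand_own_modules(file: Path) -> List[str]:
    """Drop own-created modules and keep the external imports they contain."""
    result = []
    for name in simple_imports(file):
        if is_own_module(name, file):
            own_file = file.parent / f"{name}.py"
            # Keep only imports of the own module that are themselves external.
            result.extend(
                inner for inner in simple_imports(own_file)
                if not is_own_module(inner, own_file)
            )
        else:
            result.append(name)
    return result


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "utils.py").write_text("import os\n")
    (folder / "helpers.py").write_text("import numpy\nimport utils\n")
    (folder / "main.py").write_text("import helpers\nimport pandas\n")
    expanded = expand_own_modules(folder / "main.py")

print(expanded)  # ['numpy', 'pandas']
```

Here `helpers` is dropped from the imports of `main.py` and replaced by `numpy`, the only external import inside `helpers.py`; `utils` is skipped because it is itself an own-created module.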
Functions
get_frequencies(exclude=True, process_own_modules=True)
Get the frequency of imported modules.
Parameters:
- exclude (bool, default: True) – Whether to exclude the packages listed in to_exclude.
- process_own_modules (bool, default: True) – Whether to process own-created modules.
Returns:
- pd.Series – Series of import frequencies sorted in descending order.
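As a rough illustration of what counting with exclusion means, the sketch below uses `collections.Counter` on a list of top-level package names. It is an assumption-laden stand-in, not the method's source: `import_frequencies` is a hypothetical function, it returns a list of `(name, count)` pairs rather than a `pd.Series`, and it ignores `process_own_modules`.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def import_frequencies(
    top_level_names: Iterable[str],
    to_exclude: Optional[List[str]] = None,
    exclude: bool = True,
) -> List[Tuple[str, int]]:
    """Count imports per package, most frequent first (illustrative only)."""
    # Names in to_exclude are skipped only when exclude is True.
    skipped = set(to_exclude or []) if exclude else set()
    counts = Counter(name for name in top_level_names if name not in skipped)
    # most_common() sorts by count in descending order.
    return counts.most_common()


names = ["numpy", "pandas", "numpy", "os", "pandas", "numpy"]
filtered = import_frequencies(names, to_exclude=["os"])
unfiltered = import_frequencies(names, to_exclude=["os"], exclude=False)
print(filtered)    # [('numpy', 3), ('pandas', 2)]
print(unfiltered)  # [('numpy', 3), ('pandas', 2), ('os', 1)]
```

With pandas, `pd.Series(names).value_counts()` gives the same descending counts as a Series.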