Command Line

Usage

XTractor myEmailATI myPasswordATI inputFile [-s]

 

myEmailATI

Email address used to log in to AT Internet

myPasswordATI

AT Internet password

inputFile

Type and content depend on the mode:

Data Extraction mode when it is a JSON file containing all settings: a set of Data Queries, for a set of Periods and a set of Site IDs, completed with a set of options.

Sites Update mode when it is a CSV file containing a set of Site IDs. As a result, it generates one CSV file with Site IDs, Site names and all their custom variables. This file can be used directly by Data Extraction mode configurations.

-s

Optional silent mode, useful to set up scheduled extractions with cron jobs.

 

Data Extraction mode

Syntax

Console mode

XTractor myEmailATI myPasswordATI myDataset.json

Silent mode

XTractor myEmailATI myPasswordATI myDataset.json -s

Example

XTractor john.smith@mycompany.com myPwd12 webexperience.json

Results Files

Tab-delimited CSV files: depending on the MergeFiles option, one file per Data Query, or one file per Data Query per Period, whatever the number of Sites.

One log file per dataset, created in a sub folder _logs. Example: ./_logs/myDataset.log

 

JSON Dataset file

{

  "Output":"dataTEST",

  "SubFolders":true,

  "MergeFiles":false,

  "Overwrite":false,

  "SiteName2ID":true,

  "Periods":[["2018-01-01","2018-01-31"],["2018-02-01","2018-02-28"],["M-1","M0"]],    

  "Queries":

  {

    "list":["TrafficSE","Level2","EdmsDocuments"],

    "TrafficSE":

    {

    "query":"columns={d_time_year,d_time_month,d_site,cl_27601,m_visits,m_page_views,m_visitors,m_bq}&segment=100025834&max-results=2",

      "filter":"SE"

    },

    "Level2":

    {

      "query":"columns={d_time_year,d_time_month,d_site,d_l2,m_visits}&sort={-m_visits}&max-results=20"

    },

    "EdmsDocuments":

    {

    "query":"columns={d_site,d_time_year,d_time_month,d_click_l2,d_click,d_click_chap1,d_click_chap2,d_click_chap3,m_clicks}&sort={-m_clicks}&filter={$OR:{d_click_l2:{$eq:'product+offer'},d_click_l2:{$eq:'download+center'},d_click_l2:{$eq:'utility'}},$AND:{d_click_chap1:{$neq:'product'},d_click_chap1:{$neq:'undefined'}},d_click_type:{$eq:'download'}}&space={s:225304}&period={M:'2018-01'}&max-results=50&page-num=*"

    }   

  },

  "Sites":

  {

    "file":"all.csv"

  }

}
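The documented defaults can be summarized in a short sketch. This is an illustrative reading of the manual, not XTractor's actual code: the setting names come from the dataset file above, while the function and the DEFAULTS table are assumptions of ours.

```python
import json

# Defaults as documented below for each optional setting (illustrative).
DEFAULTS = {
    "Output": "data",        # created under the XTractor directory
    "SubFolders": True,
    "MergeFiles": False,
    "Overwrite": False,
    "SiteName2ID": False,
}

def load_dataset(text):
    """Parse a dataset JSON string and fill in missing optional settings."""
    settings = json.loads(text)
    for key, value in DEFAULTS.items():
        settings.setdefault(key, value)
    # Periods, Queries and Sites are mandatory: fail early if absent.
    for key in ("Periods", "Queries", "Sites"):
        if key not in settings:
            raise ValueError(f"Mandatory setting missing: {key}")
    return settings
```

Each setting is detailed below.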

 

Output

Optional - Directory path

Contains the relative or absolute directory path where sub folders and CSV files are generated.

If the directory doesn’t exist, it is created.

If missing, “data” is created by default under the XTractor directory.

SubFolders

Optional - Boolean

false means that all files are generated directly in the Output directory.

true means that one sub folder is created per period.

If missing, true is used by default.

MergeFiles

Optional - Boolean

false means that a new file is generated for each period and each query (one sub-directory per period when SubFolders is true).

true means that only one file is generated per query, whatever the number of periods (one sub-directory when SubFolders is true).

If missing, false is used by default.

Overwrite

Optional - Boolean

false means that when the result file already exists, the file name is completed with the first unused index: …_01.csv, …_02.csv, etc.

true means that if a file with the same name already exists, it is overwritten. When SubFolders is set to false and more than one period is measured, the result file name is appended with the begin date's month, to avoid each period being overwritten by the next one.

If missing, false is used by default.
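The Overwrite=false naming rule can be sketched as follows. The helper name is ours, not XTractor's; the `exists` parameter is only there so the behaviour can be demonstrated without touching the disk.

```python
import os

# Illustrative sketch of the Overwrite=false behaviour: when the target
# file already exists, append the first unused two-digit index
# (..._01.csv, ..._02.csv, etc.).
def next_free_name(path, exists=os.path.exists):
    if not exists(path):
        return path
    base, ext = os.path.splitext(path)
    index = 1
    while exists(f"{base}_{index:02d}{ext}"):
        index += 1
    return f"{base}_{index:02d}{ext}"
```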

SiteName2ID

Optional - Boolean

false means nothing is done to retrieved Site Names.

true means that, when the d_site column is specified in a query, its value (the Site Name) is replaced by the corresponding Site ID in result files. This is useful when results are loaded into a database where the Site ID is the primary key (keeping in mind that a Site Name can change over time).

If missing, false is used by default.

 

Periods

Mandatory - Array of Arrays of two strings

Contains the list of periods: pairs of begin/end dates, in YYYY-MM-DD format. At least one period must be specified.

It can also contain relative periods, for example ["M-1"] or ["M-8","M-1"]. In a pair, both must use the same letter, and the first must reach at least as far back as the second.

If missing, process is stopped.
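Our reading of the relative-period notation can be sketched as follows, assuming "M0" is the current month, "M-1" the previous one, and a pair spanning from the first day of the earlier month to the last day of the later one. This is an interpretation of the manual, not XTractor's actual implementation.

```python
import calendar
import datetime

def month_bounds(spec, today=None):
    """Resolve a relative month spec like "M-1" to (first_day, last_day)."""
    today = today or datetime.date.today()
    offset = int(spec[1:])          # "M-1" -> -1, "M0" -> 0
    year, month = today.year, today.month + offset
    while month < 1:                # walk back across year boundaries
        month += 12
        year -= 1
    first = datetime.date(year, month, 1)
    last = datetime.date(year, month, calendar.monthrange(year, month)[1])
    return first, last

def resolve_period(period, today=None):
    """Turn ["M-8","M-1"] or ["M-1"] into a concrete (begin, end) pair."""
    begin, _ = month_bounds(period[0], today)
    _, end = month_bounds(period[-1], today)
    return begin, end
```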

Queries

Mandatory - Object containing:

list

Mandatory - Array of strings

Contains the list of query names selected for the measure. Each of them must then be defined, as shown below for myQueryName. Note that these names are used to build result file names, so avoid characters forbidden in file names.

myQueryName

Mandatory – Can use hashtag - Object containing:

query

Mandatory - String

Paste here your Data Query URL copied directly from Data Query Designer (between double quotes), whatever the Site used to build it.

You can use it as is, or shorten it by starting from columns=..., for better readability.

If your Data Query doesn't contain d_site in its columns, it is automatically added as the first column, to ensure you can distinguish data coming from different Sites.

If your Data Query uses Site Custom Variables cl_xxxxx in the columns parameter, they must be defined in the Sites file provided: the Site used to build the Data Query must be part of the Sites file setup (see below), so that XTractor can deduce the corresponding variable name and map it for all Sites. If you are not sure, you can replace each cl_xxxxx in the Data Query with the site custom variable name provided in this file (in that case, do the same for all other parameters, such as sort or filters).

Parameter order doesn't matter. The period parameter is ignored (it can be removed). The space parameter can also be removed when it doesn't contain an L2 specification.

If you want to grab all results whatever the number of rows, replace the page-num parameter value (usually 1) with *: ...&page-num=*. XTractor will count the rows and iterate over all pages. In this case, the max-results parameter is ignored and can be removed.
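The page iteration described above amounts to a simple loop. This is a hypothetical sketch: the fetch_page callable stands in for the AT Internet REST call that XTractor performs internally, and nothing here is the real API.

```python
def fetch_all_rows(fetch_page):
    """Collect rows page by page until a page comes back empty.

    fetch_page(n) is a stand-in for requesting page n of a Data Query.
    """
    rows, page = [], 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        rows.extend(batch)
        page += 1
    return rows
```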

filter

Optional - String

Contains a filter, used together with the Sites file list: each Site can be associated with a filter, and a query can refer to that filter. When both are defined, the query is executed only if they match.

If missing or set to *, the query is executed for all Sites provided.
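The matching rule above can be sketched in a few lines. The function name and shape are ours, not XTractor's:

```python
def query_applies(query_filter, site_filter):
    """A query runs against a site when it has no filter (or "*"),
    or when its filter equals the site's filter."""
    if query_filter in (None, "", "*"):
        return True
    return query_filter == site_filter
```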

 

Sites

Mandatory - Object containing:

file

Mandatory - File path

Contains the relative or absolute path of the CSV file containing the Sites list: use / as the folder separator, and do not end with /.

This file is normally produced by Sites Update mode. If produced another way, it must respect the following format:

·   Columns must be comma separated, without header.

·   First and second columns are mandatory: Site ID and Site Name. The third column is optional: the Site filter (see the query filter above for usage).

·   Subsequent columns are optional: they define Site Custom Variables, as [name cl_xxxxx] pairs.
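An illustrative parser for this format follows; the field names in the returned dicts are our own, not anything XTractor exposes.

```python
import csv
import io

def parse_sites(text):
    """Parse the Sites file: no header, comma separated, Site ID and
    Site Name mandatory, optional filter, then "name cl_xxxxx" pairs."""
    sites = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        site = {
            "id": row[0],
            "name": row[1],
            "filter": row[2] if len(row) > 2 else "",
            "variables": {},
        }
        for cell in row[3:]:
            # Each remaining cell pairs a variable name with its cl_ code.
            if " " in cell:
                name, code = cell.rsplit(" ", 1)
                site["variables"][name] = code
        sites.append(site)
    return sites
```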

 

Sites Update mode

General syntax

Console mode

XTractor myEmailATI myPasswordATI mySitesList.csv

Silent mode

XTractor myEmailATI myPasswordATI mySitesList.csv -s

Examples

XTractor john.smith@mycompany.com myPwd1234 sitesSE.csv -s

 

Input file

The input file must contain at least the list of Site IDs for which you want to grab custom variable names and their dimension key values.

335301

335300

335299

It can also contain filters in the third column: they are reproduced in the generated file.

335301,,SE

335300,,SE

335299,,SE

If it contains segments in the fourth column, they must be prefixed with #; they are then also reproduced in the generated file.

335299,,AU,#100091186

335303,,,#100091187

335305,,,#100091188

The fourth and subsequent columns can also contain custom dimensions that are not Site Custom Variables but must be reproduced as well in the generated file. In this case, they must be prefixed with !.

335299,,AU,#100091186,!SMType cl_183277,!VCType cl_183293

335303,,,#100091187,!SMType cl_183358,!VCType cl_183378

335305,,,#100091188,!SMType cl_183431,!VCType cl_183467
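The # and ! conventions above can be sketched as a small row parser; the parsing shape and field names are ours, for illustration only.

```python
def parse_update_row(row):
    """Read one Sites Update input row (a list of CSV cells):
    #-prefixed cells carry a segment, !-prefixed cells carry custom
    dimensions that are not Site Custom Variables."""
    info = {
        "id": row[0],
        "filter": row[2] if len(row) > 2 else "",
        "segment": None,
        "extra_dims": [],
    }
    for cell in row[3:]:
        if cell.startswith("#"):
            info["segment"] = cell[1:]
        elif cell.startswith("!"):
            info["extra_dims"].append(cell[1:])
    return info
```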

 

Result File

CSV file, comma delimited, named [original file name]-tmp.csv:

335301,SE Belgium (NL),SE,origin cl_27531,company cl_375898,audience cl_375899,revenue cl_375902,employee cl_375903,account cl_375904

335300,SE Belgium (FR),SE,origin cl_27530,company cl_375885,audience cl_375886,revenue cl_375889,employee cl_375890,account cl_375891

335299,SE Australia,SE,origin cl_27528,company cl_375591,audience cl_375592,revenue cl_375595,employee cl_375596,account cl_375597

A log file is also created in _logs directory: ./_logs/xtractor.log

·  All Site Custom Variable names are converted to lower case, with spaces replaced by underscores.

·  If you are not granted data extraction rights for a Site, only its name is retrieved.

·  You can change these names with a global find & replace. The provided names are used as header names in Data Extraction mode. The only constraint is to provide the same name for each Site. Note that, this way, you can also consolidate variables that are not recorded with exactly the same name in AT Internet. But in that case, the same modification must be redone after each update.

·  Site Custom Variable positions don't matter: if you see them in different columns for different Sites, no worries, it will still work. XTractor doesn't consolidate by variable position or index; it only uses the provided name.
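The lower-casing rule above is a one-liner; this illustrative helper is ours, not XTractor's:

```python
def normalize_variable_name(name):
    """Lower-case a Site Custom Variable name and replace spaces
    with underscores, as the generated file does."""
    return name.strip().lower().replace(" ", "_")
```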

 

Error Level

The Command Line module can return one of the following error level codes:

ERROR LEVEL | Meaning | Action

0 | No error | You can use the data: the XTractor process completed successfully

1 | Unknown error | AT Internet issue. Please contact support.

4 | Bad arguments | Check your command line (cannot happen from the UI module)

8 | Missing or bad Dataset file | Check the JSON file content (cannot happen from the UI module)

16 | Missing Sites group file | Check the Sites file parameter in the JSON file

17 | Bad Sites group file | Check the CSV file content

18 | Bad Custom Variable | Check custom variable usage in your queries (the codes they use must be defined in the Sites file)

32 | Unexpected error | Check your command line

64 | Bad credentials | Check your credentials and your rights to access the API and/or the requested data

65 | Bad Query | Check your queries; see the Query errors logs for details about the issue

66 | ATI Server not available | Retry later. If your Queries are big, consider scheduling them at a better time.

67 | Connection issue | Check your internet connection. If you are connected through a VPN, retry with a direct connection.

>1000 | Data Query issue | See AT Internet error codes.
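A scheduler can react to these error levels. The sketch below is hypothetical: the command line mirrors the manual's syntax, but grouping codes 66 and 67 as retryable is our own choice, and the function names are ours.

```python
import subprocess

# Codes where a later retry makes sense: ATI server unavailable (66)
# and connection issue (67). This grouping is an assumption.
RETRYABLE = {66, 67}

def classify(code):
    """Map an XTractor error level to a coarse scheduling decision."""
    if code == 0:
        return "ok"
    if code in RETRYABLE:
        return "retry"
    if code > 1000:
        return "data-query"   # see AT Internet error codes
    return "fatal"

def run_xtractor(args):
    """Run XTractor (assumed to be on PATH) and classify its exit code."""
    result = subprocess.run(["XTractor", *args])
    return classify(result.returncode)
```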

 

Architecture and scheduling

General design

 

Windows scheduling

 

Linux scheduling

 

Web services

 

Copyright © 2018, Denis Rousseau - XTagManager