
NCollector Studio can be used to its full potential in console mode, which is optimized for maximum performance and flexibility. Console mode is recommended only for experienced users.
Start it either by clicking the “NCollector Studio Console” shortcut in the Start menu or by locating the program manually from the command line. Clicking the shortcut takes you to the folder in which the program is installed. From there you can run the NCollector Studio command-line tool with the options described in the following chapters.

Please note that project files (*.wrp) created in the NCollector Studio UI can also be loaded from the command-line and vice versa.
An overview of the available command-line options can be displayed at any time by typing the command:
ncconsole.exe /?
| Parameter | Description | Example |
|---|---|---|
| -mirror | Downloads all links found, including HTML pages. | -mirror |
| -offlinebrowse | Downloads all links found, including HTML pages. Translates links to local relative paths after the download completes. | -offlinebrowse |
| -filerip | Downloads only links with a given extension. | -filerip |
| -url | The address to start ripping from. Can be used multiple times. Must include the protocol prefix (http://, https://). | -url=http://NCollector Studio.org |
| -domainonly | The crawler only scans and downloads links from the domains (including subdomains) specified as start addresses. | -domainonly |
| -subdomainonly | The crawler only scans and downloads links from the domains (not including other subdomains) specified as start addresses. | -subdomainonly |
| -folderonly | The crawler does not download links higher than the folder specified in the start addresses. | -folderonly |
| -levels | The number of levels the crawler scans for links. A level can be thought of as the number of links it would take to reach a page when browsing in a web browser. Default is one level. | -levels=2 |
| -maxpages | The number of pages to scan for links. Default is unlimited. | -maxpages=100 |
| -userobots.txt | Tells NCollector Studio to follow the rules specified in the server's robots.txt file. This option might cause fewer links to be found and may slow down ripping. | -userobotstxt |
| -name | The name of the project. When a name is specified, the downloaded files are put in a folder with this name. When the same project is run multiple times, the caching function also prevents files that have not changed from being downloaded again. | -name=MyRipProject |
| -save | Saves a project template to disk for later reuse. All settings are saved when this option is specified. | -save=MyRipProject.wrp |
| -load | Instructs NCollector Studio to load a previously saved project template. No other options can be used together with this option. | -load=MyRipProject.wrp |
| -noreport | Instructs NCollector Studio to skip generation of the post-session HTML report. | -noreport |
| -keyword | Instructs NCollector Studio to include or exclude links with given keywords in the address. The keyword filter supports a number of options which should precede the keyword option. The options are: include or exclude; complete, withoutfilename, or onlyfilename; downloaded, spidered, or all. See the rows below for detailed descriptions. | -keyword=ferrari |
| /include | Links with the given keyword should be included. The default. | -keyword=ferrari /include |
| /exclude | Links with the given keyword should be excluded. | -keyword=ferrari /exclude |
| /complete | The complete URL should be checked when parsing for the given keyword. The default. | -keyword=ferrari /complete |
| /withoutfilename | The URL excluding the filename should be checked when parsing for the given keyword. | -keyword=ferrari /withoutfilename |
| /onlyfilename | Only the filename should be checked when parsing for the given keyword. | -keyword=ferrari /onlyfilename |
| /downloaded | Only downloaded files should be checked for the keyword. The default. | -keyword=ferrari /downloaded |
| /spidered | Only spidered files should be checked for the keyword. | -keyword=ferrari /spidered |
| /all | All files should be checked for the keyword. | -keyword=ferrari /all |
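
The scope options of the keyword filter (/complete, /withoutfilename, /onlyfilename) differ only in which part of the address is searched. The following Python sketch illustrates that idea; it is not NCollector Studio's implementation. The function names `keyword_target` and `link_allowed` are hypothetical, case-insensitive matching is an assumption, and the /downloaded, /spidered, and /all options are not modeled.

```python
import posixpath
from urllib.parse import urlsplit


def keyword_target(url: str, scope: str = "complete") -> str:
    """Return the part of the URL a keyword would be checked against."""
    if scope == "complete":
        return url
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    if scope == "onlyfilename":
        return filename
    if scope == "withoutfilename":
        # Scheme, host, and folder; the filename (and any query) is dropped.
        return f"{parts.scheme}://{parts.netloc}{directory.rstrip('/')}/"
    raise ValueError(f"unknown scope: {scope}")


def link_allowed(url: str, keyword: str,
                 mode: str = "include", scope: str = "complete") -> bool:
    """Apply an include/exclude keyword rule to a single link."""
    if mode not in ("include", "exclude"):
        raise ValueError(f"unknown mode: {mode}")
    # Assumption: matching ignores case; the manual does not say.
    found = keyword.lower() in keyword_target(url, scope).lower()
    return found if mode == "include" else not found


url = "http://cars.example.com/ferrari/f40.jpg"
print(link_allowed(url, "ferrari"))                        # keyword is in the full URL
print(link_allowed(url, "ferrari", scope="onlyfilename"))  # keyword is not in "f40.jpg"
```

With these assumptions, the same keyword can match under /complete but not under /onlyfilename, which is why the scope option matters when a keyword appears only in a folder name.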

| Parameter | Description | Example |
|---|---|---|
| -extension | File type to rip. Only applicable in filerip-mode. Can be used multiple times. At least one is mandatory. | -extension=.jpg |
| -minsize | The minimum size in kilobytes of the files downloaded in file rip mode. | -minsize=100 |
| -maxsize | The maximum size in kilobytes of the files downloaded in file rip mode. | -maxsize=1000 |
| -minwidth | The minimum width in pixels of images downloaded in file rip mode. | -minwidth=640 |
| -minheight | The minimum height in pixels of images downloaded in file rip mode. | -minheight=480 |
| -maxwidth | The maximum width in pixels of images downloaded in file rip mode. | -maxwidth=1024 |
| -maxheight | The maximum height in pixels of images downloaded in file rip mode. | -maxheight=768 |
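
To make the interaction of the file rip limits concrete, here is a minimal Python sketch of how such a filter could decide whether to keep a file. It is an illustration under stated assumptions, not the program's actual logic: the class `FileRipFilter` is hypothetical, all limits are assumed to be inclusive and combined with AND, one kilobyte is assumed to be 1024 bytes, and the pixel limits are assumed to apply only when the dimensions are known.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FileRipFilter:
    extensions: Tuple[str, ...]          # e.g. (".jpg",); at least one is required
    min_size_kb: Optional[int] = None    # mirrors -minsize
    max_size_kb: Optional[int] = None    # mirrors -maxsize
    min_width: Optional[int] = None      # mirrors -minwidth
    max_width: Optional[int] = None      # mirrors -maxwidth
    min_height: Optional[int] = None     # mirrors -minheight
    max_height: Optional[int] = None     # mirrors -maxheight

    def accepts(self, filename: str, size_bytes: int,
                width: Optional[int] = None,
                height: Optional[int] = None) -> bool:
        """Return True if the file passes every configured limit."""
        wanted = tuple(ext.lower() for ext in self.extensions)
        if not filename.lower().endswith(wanted):
            return False
        size_kb = size_bytes / 1024  # assumption: 1 KB = 1024 bytes
        if self.min_size_kb is not None and size_kb < self.min_size_kb:
            return False
        if self.max_size_kb is not None and size_kb > self.max_size_kb:
            return False
        # Pixel limits are skipped when a dimension is unknown (non-images).
        for value, low, high in ((width, self.min_width, self.max_width),
                                 (height, self.min_height, self.max_height)):
            if value is None:
                continue
            if low is not None and value < low:
                return False
            if high is not None and value > high:
                return False
        return True


images = FileRipFilter(extensions=(".jpg",), min_width=300, min_height=200)
print(images.accepts("photo.jpg", 50_000, width=640, height=480))
print(images.accepts("thumb.jpg", 5_000, width=120, height=90))
```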
Scenario: We want to rip all JPEG-images with a minimum width of 300 pixels and minimum height of 200 pixels. The spider should search for links 2 levels deep from the start-address.
Command-line:
ncconsole.exe -filerip -url=http://calluna-software.com -levels=2 -extension=.jpg -minwidth=300 -minheight=200
Scenario: We want to rip all MPEG movies with a minimum size of 1 MB and a maximum size of 10 MB. The spider should scan at most 100 pages for links.
Command-line:
ncconsole.exe -filerip -url=http://movies.com -maxpages=100 -extension=.mpg -minsize=1000 -maxsize=10000
Scenario: We want to be able to browse a site without an internet-connection. We want to download two levels, and restrict the spider to only follow links within the specified subdomain.
Command-line:
ncconsole.exe -offlinebrowse -url=http://galleries.photo.com -levels=2 -subdomainonly
Scenario: We want to make a copy of the web pages and files on the servers in a domain. Links should not be translated. The project should be saved so that we can easily start it again another time.
Command-line:
ncconsole.exe -mirror -url=http://www.myblogs.com -maxpages=1000 -levels=9 -domainonly -save=myblogs.wrp -name=myblogs
Scenario: We want to run the job in the previous example again.
Command-line:
ncconsole.exe -load=myblogs.wrp
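
When jobs like these are launched from a script, it can help to assemble the argument list programmatically instead of concatenating strings. The Python sketch below shows one way to do that. The helpers `build_command` and `run` are hypothetical, the sketch assumes `ncconsole.exe` is on the PATH or in the current folder, and it uses only the parameters documented above. How the tool reports errors through its exit code is not covered here, so the return code is passed back unchanged.

```python
import subprocess
from typing import List, Sequence


def build_command(mode: str, urls: Sequence[str],
                  exe: str = "ncconsole.exe", **options) -> List[str]:
    """Build an argument list such as [exe, '-filerip', '-url=...', ...]."""
    args = [exe, f"-{mode}"]
    args += [f"-url={u}" for u in urls]           # -url may be repeated
    for name, value in options.items():
        if value is True:
            args.append(f"-{name}")               # switch without a value
        elif value is False or value is None:
            continue                              # option not requested
        elif isinstance(value, (list, tuple)):
            args += [f"-{name}={v}" for v in value]   # repeatable option
        else:
            args.append(f"-{name}={value}")
    return args


def run(args: List[str]) -> int:
    """Run the tool and return its exit code (requires NCollector Studio)."""
    return subprocess.run(args).returncode


# First scenario above: JPEG images at least 300x200 pixels, two levels deep.
cmd = build_command("filerip", ["http://calluna-software.com"],
                    levels=2, extension=[".jpg"], minwidth=300, minheight=200)
print(" ".join(cmd))
```

Passing the list to `run(cmd)` on a machine with NCollector Studio installed starts the job; keeping the arguments as a list avoids quoting problems with addresses that contain special characters.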