Caffe digits for windows


Build DIGITS on Windows

DIGITS for Windows depends on the Windows branch of BVLC Caffe. The layers required for the DetectNet feature are not implemented in that branch.

As a result, DIGITS for Windows does not support DetectNet. To run DIGITS with DetectNet, please use NV-Caffe 0.15 or above on Ubuntu.

Download and install 64-bit Python 2.7.11 from Python’s official site (https://www.python.org/ftp/python/2.7.11/python-2.7.11.amd64.msi). Please select the option that adds Python to the Path during installation.

Download numpy, scipy, matplotlib, scikit-image, and h5py from the Unofficial Windows Binaries for Python Extension Packages webpage (http://www.lfd.uci.edu/~gohlke/pythonlibs/). Remember to download the correct version (2.7) and architecture (64-bit).
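One way to confirm which wheel variant applies is to ask the interpreter itself. The following is a minimal sketch rather than part of DIGITS; the helper names `interpreter_build` and `matches_required_build` are hypothetical.

```python
import struct
import sys


def interpreter_build():
    """Return (major, minor, pointer width in bits) for the running interpreter."""
    bits = struct.calcsize("P") * 8  # a C pointer is 8 bytes on a 64-bit build
    return sys.version_info[0], sys.version_info[1], bits


def matches_required_build(major, minor, bits):
    """True when the build is the 64-bit Python 2.7 that this guide targets."""
    return (major, minor, bits) == (2, 7, 64)
```

Calling `matches_required_build(*interpreter_build())` from the freshly installed interpreter should return True before any wheels are downloaded.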

Additionally, download gevent v1.0.2 from the same site. Run the command prompt (cmd.exe) as administrator, and issue the following commands.

If the installation process complains that a compiler was not found, you need to install Microsoft Visual C++ Compiler for Python 2.7, available at (https://www.microsoft.com/en-us/download/details.aspx?id=44266). We recommend installing it with msiexec /i VCForPython27.msi ALLUSERS=1

After the compiler is installed, finish the above python -m pip install commands.

Do not install gevent at this point. It needs to be installed after DIGITS.

CUDA 7.5 can be obtained from NVIDIA CUDA (https://developer.nvidia.com/cuda-downloads). Please select Windows 7 to download.

Download cuDNN 5.1 from the NVIDIA website (https://developer.nvidia.com/cudnn). Please select cuDNN 5.1 for CUDA 7.5.

Caffe can be obtained at (https://github.com/bvlc/caffe/tree/windows). Note that you need Visual Studio 2013 to build Caffe. Before building it, enable Python support, CUDA, and cuDNN by following the instructions on the same page. Because we are using the official CPython, please change the value of the PythonDir tag from C:\Miniconda2\ to C:\PYTHON27\ (assuming your CPython installation is in the default C:\PYTHON27). After building it, configure your Python environment to include pycaffe, as described at (https://github.com/bvlc/caffe/tree/windows#remark). Your caffe.exe will be inside the Build\x64\Release directory (if you made a release build).

Graphviz is available at (www.graphviz.org/Download.php). Please note that this site is not always available online. The installation directory cannot contain spaces, so don’t install it under the regular ‘c:\Program Files (x86)’ directory. Try something like ‘c:\graphviz’ instead. When the installation directory contains a space, pydot cannot launch dot.exe, even though it has no problem finding it. Add the c:\graphviz\bin directory to your PATH.
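Both conditions above (no space in the installation directory, and dot.exe reachable through PATH) can be checked before DIGITS is started. This is a sketch under stated assumptions: the helper names `has_space` and `find_on_path` are hypothetical, and the `exists` parameter is there only so the lookup can be exercised without a real Graphviz installation.

```python
import os


def has_space(directory):
    """pydot fails to launch dot.exe when its install directory contains a space."""
    return " " in directory


def find_on_path(executable, path_value, exists=os.path.isfile):
    """Return the first PATH entry that contains `executable`, or None."""
    for entry in path_value.split(os.pathsep):
        candidate = os.path.join(entry, executable)
        if entry and exists(candidate):
            return entry
    return None
```

A typical call would be `find_on_path("dot.exe", os.environ.get("PATH", ""))`, followed by `has_space` on the directory it returns.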

Clone DIGITS from github.com (https://github.com/nvidia/digits). From the command prompt (run as administrator), cd to the DIGITS directory. Then type python -m pip install -r requirements.txt

You may see an error about Pillow, such as ValueError: jpeg is required unless explicitly disabled using --disable-jpeg, aborting. If this happens, download the Pillow Windows installer (Pillow-3.1.1.win-amd64-py2.7.exe) from https://pypi.python.org/pypi/Pillow/3.1.1 and run the executable. After installing Pillow in the above way, run

After the above command, check whether all required Python dependencies are met by comparing requirements.txt with the output of the following command.
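The comparison can also be scripted. The sketch below assumes you already have the names of the installed packages as a list; it compares names only and ignores version constraints, so it is a first pass and not a full check. The function names are hypothetical.

```python
import re


def requirement_names(requirements_text):
    """Extract lower-cased package names from requirements.txt-style text."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9_.\-]+", line)  # name ends at the first specifier
        if match:
            names.append(match.group(0).lower())
    return names


def missing_packages(requirements_text, installed_names):
    """Return required names that do not appear among the installed ones."""
    installed = set(name.lower() for name in installed_names)
    return [n for n in requirement_names(requirements_text) if n not in installed]
```

An empty result from `missing_packages` means every required name is present; versions such as gevent's still need to be checked by eye.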

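If gevent is not v1.0.2, install it from the whl file downloaded previously from (http://www.lfd.uci.edu/~gohlke/pythonlibs/), using python -m pip install gevent-1.0.2-cp27-none-win_amd64.whl

This should uninstall the gevent you had and install gevent 1.0.2.

Because readline is not available on Windows, you need to install one additional Python package, pyreadline, with python -m pip install pyreadline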

First, check if caffe executable is included in your PATH environment variable. If not, add it.

Replace MY_CAFFE_ROOT with your local caffe directory.

Launch DIGITS devserver with the following command:

Point your browser to localhost:5000. You should be able to see DIGITS.

DIGITS crashes when trying to classify images with Show visualizations and statistics

This issue should have been resolved. However, if you still encounter it, it seems to be related to different hdf5 DLL bindings in pycaffe and h5py. The DLL used by pycaffe was pulled from NuGet, and its version is 1.8.15.2, slightly older than the DLL in h5py. A temporary solution is to load h5py before pycaffe. To force this, you can either add one line at the beginning of the digits-devserver file, or import h5py just before import caffe in digits/config/caffe_option.py.
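In the files named above, the fix is simply an `import h5py` line placed ahead of `import caffe`. To illustrate the ordering in a form that also runs on machines where neither package is installed, here is a small sketch; `import_in_order` is a hypothetical helper, not something DIGITS provides.

```python
import importlib


def import_in_order(module_names):
    """Import modules one by one, in the given order, and return those that loaded."""
    loaded = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            continue  # module is not installed here; skip it
        loaded.append(name)
    return loaded
```

Calling `import_in_order(["h5py", "caffe"])` at the very start of the process would bind h5py's HDF5 DLL before pycaffe is loaded.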

import readline causes ImportError

Change import readline in digits\config\prompt.py to
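The exact replacement text is not preserved in this copy of the guide. A plausible form, assuming the pyreadline package mentioned in the installation steps is the intended substitute, is a guarded import such as the sketch below; the final fallback to None is an addition for illustration.

```python
try:
    import readline  # present on Unix-like systems
except ImportError:
    try:
        import pyreadline as readline  # Windows substitute, installed with pip
    except ImportError:
        readline = None  # neither module is present; line editing is unavailable
```

Code that uses `readline` afterwards should tolerate the None case if pyreadline might be absent.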

DIGITS complains that the Torch binary is not found in PATH

Currently, DIGITS does not support Torch on the Windows platform.


This guide provides an overview of using DIGITS with Caffe. NVCaffe is an NVIDIA-maintained fork of BVLC Caffe tuned for NVIDIA GPUs, particularly in multi-GPU configurations, and accelerated by the NVIDIA Deep Learning SDK. It includes multi-precision support as well as other NVIDIA-enhanced features, and offers performance specially tuned for NVIDIA GPU systems.

1. Overview Of DIGITS

The Deep Learning GPU Training System™ (DIGITS) puts the power of deep learning into the hands of engineers and data scientists.

DIGITS is not a framework. DIGITS is a wrapper for NVCaffe™ and TensorFlow™ that provides a graphical web interface to those frameworks, so you do not have to deal with them directly on the command line.

DIGITS can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation, object detection tasks, and more. DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best performing model from the results browser for deployment. DIGITS is completely interactive so that data scientists can focus on designing and training networks rather than programming and debugging.

2. Limitations

DIGITS for Windows depends on the Windows branch of BVLC Caffe. The layers required for the DetectNet feature are not implemented in that branch.

As a result, DIGITS for Windows does not support DetectNet. To run DIGITS with DetectNet, please use NV-Caffe 0.15 or above on Ubuntu.

3. Prerequisites

Prerequisites for using Caffe with DIGITS include:

Installing Prerequisites

Follow these instructions to install the required software components.

Python 2

Download and install 64-bit Python 2.7.11 from Python’s official site (https://www.python.org/ftp/python/2.7.11/python-2.7.11.amd64.msi). Please select the option that adds Python to the Path during installation.
Download numpy, scipy, matplotlib, scikit-image, and h5py from the Unofficial Windows Binaries for Python Extension Packages webpage (http://www.lfd.uci.edu/~gohlke/pythonlibs/). Remember to download the correct version (2.7) and architecture (64-bit).

Additionally, download gevent v1.0.2 from the same site. Run the command prompt (cmd.exe) as administrator, and issue the following commands:

If the installation process complains that a compiler was not found, you need to install Microsoft Visual C++ Compiler for Python 2.7, available at https://www.microsoft.com/en-us/download/details.aspx?id=44266. NVIDIA recommends installing it using:

msiexec /i VCForPython27.msi ALLUSERS=1

After the compiler is installed, finish the above python -m pip install commands. Install gevent after installing DIGITS.

CUDA 7.5

Go to NVIDIA CUDA (https://developer.nvidia.com/cuda-downloads) to get CUDA 7.5. Please select Windows 7 to download.

cuDNN 5.1

Go to the NVIDIA cuDNN website (https://developer.nvidia.com/cudnn) to download cuDNN 5.1. Please select cuDNN 5.1 for CUDA 7.5.

Caffe

You can get Caffe at (https://github.com/bvlc/caffe/tree/windows). Note that you need Visual Studio 2013 to build Caffe. Before building it, enable Python support, CUDA, and cuDNN by following the instructions on the same page. Because we are using the official CPython, please change the value of the PythonDir tag from C:\Miniconda2\ to C:\PYTHON27\ (assuming your CPython installation is in the default C:\PYTHON27). After building it, configure your Python environment to include pycaffe, as described at (https://github.com/bvlc/caffe/tree/windows#remark). Your caffe.exe will be inside the Build\x64\Release directory (if you made a release build).

Graphviz

Graphviz is available at (www.graphviz.org/Download.php). Please note that this site is not always available online. The installation directory cannot contain a space, so don’t install it under the c:\Program Files (x86) directory. Instead use a directory name like c:\graphviz. When the installation directory contains a space, pydot cannot launch dot.exe, even though it has no problem finding it. Add the c:\graphviz\bin directory to your PATH.

Installing DIGITS

Clone DIGITS from github.com (https://github.com/nvidia/digits). From a command prompt run as administrator, cd to the DIGITS directory and enter:

python -m pip install -r requirements.txt

After the above command, check whether all required Python dependencies are met by comparing requirements.txt with the output of the following command:

If gevent is not v1.0.2, install it from the whl file downloaded previously from (http://www.lfd.uci.edu/~gohlke/pythonlibs/):

python -m pip install gevent-1.0.2-cp27-none-win_amd64.whl

It should uninstall the gevent you had, and install gevent 1.0.2.

Because readline is not available in Windows, you need to install one additional Python package.

python -m pip install pyreadline

Running DIGITS

Add the Caffe executable if it is not included in your PATH environment variable.

Replace MY_CAFFE_ROOT with your local Caffe directory. Launch DIGITS devserver using the following command:

Use your browser to navigate to localhost:5000. You should be able to see DIGITS.

Troubleshooting

For troubleshooting tips, see the NVIDIA DIGITS Troubleshooting and Support Guide.

DIGITS crashes when trying to classify images

Import readline causes ImportError

Change import readline in digits\config\prompt.py to


Why you still can't become a machine learning super-mega-pro in 15 minutes

Yesterday I published an article about machine learning and NVIDIA DIGITS. As promised, today's article covers why things are not quite so rosy, plus an example of detecting objects in a frame with DIGITS.

NVIDIA has stirred up a wave of publicity around DetectNet, a network it developed and implemented in DIGITS. The network is positioned as a solution for finding identical or similar objects in an image.

What it is

Earlier this year I mentioned the amusing YOLO network several times. On the whole, everyone I talked to took a rather negative view of it, saying that Faster-RCNN was much faster and simpler. NVIDIA's engineers, however, were inspired by it and built their own network on Caffe, calling it DetectNet.
The principle is the same as in YOLO. For an image of size (N*a × N*a), the network output is an N*N*5 array in which each a*a region of the source image is given 5 parameters: the presence of an object and its size.
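To make the N*N*5 output concrete, here is a minimal decoding sketch. It assumes a per-cell layout of (coverage, x1, y1, x2, y2) with coordinates already in pixels; the article only says "presence and size", so this layout and both function names are assumptions, not DetectNet's actual format.

```python
def grid_side(image_side, cell_side):
    """Cells along one side: an image of N*a pixels holds N cells of a pixels each."""
    return image_side // cell_side


def decode_grid(grid, threshold):
    """Collect (x1, y1, x2, y2, coverage) from every cell whose coverage passes."""
    boxes = []
    for row in grid:
        for coverage, x1, y1, x2, y2 in row:
            if coverage >= threshold:  # the cell claims an object is present
                boxes.append((x1, y1, x2, y2, coverage))
    return boxes
```

Because neighbouring cells tend to fire on the same object, the list returned here usually contains several overlapping boxes per object.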

It is fast. I got 10-20 ms per frame, whereas Faster-RCNN took 100-150.
It is easy to train and tune. Faster-RCNN took a lot of fiddling.

There is one downside: other solutions offer higher-quality detection.

A few general words before I begin

Unlike category recognition, which I wrote about yesterday, object detection is done poorly. It is not user friendly. Most of this article is about how to get this thing running at all. Unfortunately, that undermines the original idea of DIGITS, which is that you can build something without understanding the system's logic or its math.
But once you do get it running, it is convenient to use.

What we will detect

A couple of years ago we had a completely crazy project involving car license plates, which grew into a whole series of articles. Among other things, it produced a sizable database of photographs, which we published.

I decided to reuse part of that work and detect license plates with DIGITS, so that is what we will use.

The database I had labeled in the required way was very small and made for other purposes, but it was enough for training.

Let's go

Choosing "New Dataset->Images->Object Detection" in the main menu takes us to the dataset creation menu. The following must be specified here:

Training image folder: the folder with the images
Training label folder: the folder with the text label files for the images
Validation image folder: the folder with the validation images
Validation label folder: the folder with the text label files for them
Pad image: if an image is smaller than the size given here, it is padded with a black background. If it is larger, dataset creation fails ¯\_(ツ)_/¯
Resize image: the size to resize images to
Minimum box size: it is best to set this value. It is the minimum object size used during validation

There is a difficulty here. How do you make the text label file that describes an image? NVIDIA's example on GitHub, in the official DIGITS repository, modestly keeps quiet about this, mentioning only that the format is the same as in the KITTI dataset. I was somewhat surprised by this attitude toward users of an out-of-the-box framework. But fine. I went and downloaded the dataset and its docs and read them. The file format:

Naturally, most of the parameters are not needed here. In practice you can keep only the "bbox" parameter; the rest will not be used anyway.
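For reference, a KITTI object label line has 15 space-separated fields: type, truncated, occluded, alpha, four bbox values (left, top, right, bottom), three dimensions, three location values, and rotation_y. The sketch below writes such a line with only the type and bbox filled in and zeros elsewhere; the function name is hypothetical, and the class name passed in is a placeholder you would adapt to your dataset.

```python
def kitti_label_line(object_type, left, top, right, bottom):
    """Build one KITTI-style label line with only the type and bbox filled in."""
    fields = [
        object_type,          # type
        "0.0",                # truncated
        "0",                  # occluded
        "0.0",                # alpha
        "%.2f" % left,        # bbox: left
        "%.2f" % top,         # bbox: top
        "%.2f" % right,       # bbox: right
        "%.2f" % bottom,      # bbox: bottom
        "0.0", "0.0", "0.0",  # dimensions: height, width, length
        "0.0", "0.0", "0.0",  # location: x, y, z
        "0.0",                # rotation_y
    ]
    return " ".join(fields)
```

One such line per object goes into a text file that shares its base name with the image it describes.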

As it turned out later, there was a second tutorial for DIGITS in which the file format was in fact described. But it was not in the DIGITS repository ¯\_(ツ)_/¯

It confirms that my guesses about what needed to be used were correct:

Starting training

Great. The dataset is built, so we start training. For training you need to use the same settings as in the example:

Subtract Mean set to None
Base learning rate set to 0.0001
ADAM solver
Select your dataset
Select the "Custom Network" tab. Copy into it the text of the file "/caffe-caffe-0.15/examples/kitti/detectnet_network.prototxt" (this is in NVIDIA's fork of Caffe, obviously).
It is also recommended to download a pretrained GoogleNet model and specify it in "Pretrained model(s)"

I also did the following. In the copied "detectnet_network.prototxt" network, I replaced every image-size value "1248, 352" with the image dimensions from my own dataset. Without this, training crashed. Naturally, none of the tutorials mentions this… ¯\_(ツ)_/¯
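The size substitution can be done with a whole-number search and replace. The sketch below assumes that 1248 is the width and 352 the height, and that these two numbers appear in the file only as image dimensions; both are assumptions, so review the result by hand. The function name is hypothetical.

```python
import re


def replace_image_size(prototxt_text, new_width, new_height,
                       old_width=1248, old_height=352):
    """Swap the stock width/height numbers for new ones in a single pass."""
    mapping = {str(old_width): str(new_width), str(old_height): str(new_height)}
    # \b keeps 1248 from matching inside a longer number such as 12480.
    pattern = r"\b(%s)\b" % "|".join(re.escape(key) for key in mapping)
    return re.sub(pattern, lambda match: mapping[match.group(1)], prototxt_text)
```

A single pass is used so that a new value equal to one of the old ones is not replaced a second time.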

The loss curve goes down, so training is under way. But… the accuracy curve sits at zero. What is going on?!
Neither of the two tutorials I found answered this question. I went digging in the network description. Where to dig was clear right away: since the loss is falling, training is working, so the error is in the validation pipeline. And indeed, the network configuration contains this block:

It looks suspicious. Opening the description of the clustering layer, you can find this comment:

It becomes clear that these are thresholds. I plugged in 3 random numbers without digging into the details. Training took off and the validation score started to grow. In about 5 hours it reached some reasonable levels.

But here's the catch. Despite the successful training, 100% of the images went unrecognized. I had to dig in and figure out what this layer means.

The layer merges the resulting hypotheses into a single decision. The main tool used here is the OpenCV function "cv.groupRectangles", which combines groups of rectangles into one rectangle. As you remember, the network is structured so that there should be many detections in the vicinity of an object. They have to be merged into a single decision. The merging algorithm has a bunch of parameters.

gridbox_cvg_threshold (0.05): the object detection threshold. In essence, the confidence that we have found a plate. The lower it is, the more detections.
gridbox_rect_threshold (1): how many detectors must fire for the decision "there is a plate" to be made
gridbox_rect_eps (0.02): by what factor the sizes of rectangles may differ for them to be merged into one hypothesis
min_height: the minimum object height
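The effect of two of these parameters can be illustrated with a pre-filtering step that runs before rectangles are grouped. This is a sketch, not the layer's real code: the box layout (x1, y1, x2, y2, coverage), the use of `>=` in the comparisons, and the function name are all assumptions, and the grouping itself (done with cv.groupRectangles according to the text above) is not shown.

```python
def filter_candidates(boxes, gridbox_cvg_threshold=0.05, min_height=0):
    """Keep boxes that are confident enough and tall enough to be grouped.

    Each box is a tuple (x1, y1, x2, y2, coverage).
    """
    kept = []
    for x1, y1, x2, y2, coverage in boxes:
        tall_enough = (y2 - y1) >= min_height
        if coverage >= gridbox_cvg_threshold and tall_enough:
            kept.append((x1, y1, x2, y2, coverage))
    return kept
```

Lowering `gridbox_cvg_threshold` lets more candidates through, which matches the remark that a smaller value gives more detections.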

Now it is fairly easy to tune them so that everything works. And now for the funny part: there was in fact a third tutorial, where part of all this is described.
But not all of it ¯\_(ツ)_/¯

The result

In the end you can look at what the network picked out:

The small size of the training set (about 1500 frames) makes itself felt. Dirty plates were not included in the dataset, so they are not detected. Plates in strong perspective were not included, so they are not detected. Neither were plates that are too large or too small. Well, you get the idea. In short, you need to make the effort and properly label about 5 thousand plates.

During recognition you can look at some cool images of the activation maps (1,2,3). You can see that at each successive level the plate is visible more and more clearly.

How to run it

About 20 lines, and you have a ready-made plate detector:

I have published the deploy file for the network and the weights of the trained network, in case anyone needs them.
