Good afternoon. I have a question. When I parse data from a site, the fields come out truncated. Is there a limit on the length (size) of an attribute?
Here is the code I am parsing:
<div id="content_features" class="wysiwyg-content" style="display: block;">
<h2 class="subheader">
Основные
</h2>
<div class="form-field" style="overflow:hidden;">
<div style="width:350px;padding:12px 10px;color:#404040;float:left;">
Бренд:
</div>
<div style="width:290px;padding:12px 10px;color:#404040;float:left;overflow:hidden;">
TOSHIBA
</div>
</div>
<div class="form-field" style="overflow:hidden;">
<div style="width:350px;padding:12px 10px;color:#404040;float:left;">
Емкость:
</div>
<div style="width:290px;padding:12px 10px;color:#404040;float:left;overflow:hidden;">
4400mAh
</div>
</div>
<div class="form-field" style="overflow:hidden;">
<div style="width:350px;padding:12px 10px;color:#404040;float:left;">
Напряжение:
</div>
<div style="width:290px;padding:12px 10px;color:#404040;float:left;overflow:hidden;">
10.8V
</div>
</div>
<div class="form-field" style="overflow:hidden;">
<div style="width:350px;padding:12px 10px;color:#404040;float:left;">
Цвет:
</div>
<div style="width:290px;padding:12px 10px;color:#404040;float:left;overflow:hidden;">
черный
</div>
</div>
<h3>Совместимости</h3>
<div class="no_selects" style="-webkit-user-select: none;">
<p style="font-size:16px;color:black;">Совместимые модели ноутбуков:</p>
<div align="left">
<div style="width:160px;height:30px;border:0px;float:left;">
Equium P200 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Equium P200-178
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Equium P200-1ED
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L350 Series
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L350-ST2121
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L350-ST2701
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L355 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L355-S7811
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L355-S7812
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L355-S7831
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite L355D Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200 Series
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-10A
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-10C
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-10G
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-10O
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-10T
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-11P
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-123
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-12U
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-12V
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-12W
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-136
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-139
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13B
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13F
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13H
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13I
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13K
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13M
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13Y
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-13Z
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-140
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-144
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-14O
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-154
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-155
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-156
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-157
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-15U
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-16J
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-16W
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-16X
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-177
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-17B
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-17C
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-18C
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-18E
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-195
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1B6
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1BK
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1BY
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1C2
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1C7
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1CB
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1D0
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1DE
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1E7
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1E9
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1EA
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1EE
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1F5
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1FC
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1FT
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1FY
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1FZ
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1G2
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1G4
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1G7
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-1G8
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-ST2061
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200-ST2071
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-107
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-108
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-10A
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-10L
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-10P
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-111
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-11G
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-11J
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-11M
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-11R
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-1FI
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P200D-1FW
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6237
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6247
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6257
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6267
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6277
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6287
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6297
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6298
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6307
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6327
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6337
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6347
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S6348
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S7469
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S7476
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S7482
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205-S7484
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205D Series
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205D-S7429
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205D-S7436
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205D-S7438
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205D-S7439
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205D-S7454
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P205D-S7479
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P300 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P300-ST3014
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8814
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8820
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8822
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8823
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8824
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8825
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8826
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305-S8854
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305D Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305D-S8816
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305D-S8818
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite P305D-S8819
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro L350 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro L350-S1701
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P200 Series
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P200-14W
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P200-150
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P200-15E
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P200-19R
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P200HD-1DT
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P300-14P
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P300-14Q
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P300-14R
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P300-14S
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite Pro P300-16V
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-203
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-20C
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-20F
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-20G
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-20J
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-20O
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-20S
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-213
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-214
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-219
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21D
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21E
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21F
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21L
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21P
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21U
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21V
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21W
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21X
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X200-21Y
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205 Series
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-S7483
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-S9349
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-S9359
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-S9810
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-SLi1
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-SLi2
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-SLi3
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-SLi4
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-SLi5
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Satellite X205-SLi6
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
</div>
</div>
</div>
<div style="clear:both;"></div>
<p style="font-size:16px;color:black;">Совместимые парт номера:</p>
<div class="no_selects" align="left" style="-webkit-user-select: none;">
<div style="width:160px;height:30px;border:0px;float:left;">
PA3536U-1BRS
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
PA3537U-1BAS
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
PA3537U-1BRS
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
PABAS100
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
PABAS101
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
</div>
</div>
</div>
Parsing parameters:
Parsing start text: Совместимости
Number of the column containing the attribute value, or parsing parameters: color:black;">,</p>,left;">,id="content_compatibility"
After parsing I get 2 attributes:
Совместимые модели ноутбуков (compatible laptop models)
Совместимые парт номера (compatible part numbers)
In the first one I get only 12 pieces (divs), and the second one is empty. It looks as if the module simply does not go any further.
The product name has the same kind of problem. The code being parsed:
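To check how many values a complete pass over this markup should return, the cells can be counted independently of the module. Below is a minimal sketch using Python's standard `html.parser`. The names `CellCollector` and `extract_cells` are hypothetical, and filtering on `width:160px` is an assumption taken from the markup above, not anything the parsing module itself does.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <div> whose style contains 'width:160px'."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # nesting depth inside a matching cell (0 = outside)
        self.buffer = []   # text fragments of the cell being read
        self.cells = []    # non-empty cell texts, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # nested div inside a cell
        elif "width:160px" in (dict(attrs).get("style") or ""):
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag != "div" or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            text = "".join(self.buffer).strip()
            if text:  # skip the empty filler cell at the end of each list
                self.cells.append(text)

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def extract_cells(html):
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


# Shortened sample in the same shape as the markup above.
sample = """
<div align="left">
<div style="width:160px;height:30px;border:0px;float:left;">
Equium P200 Series
</div>
<div style="width:160px;height:30px;border:0px;float:left;">
Equium P200-178
</div>
<div style="clear:both;"></div>
<div style="width:160px;height:30px;border:0px;float:left;">
</div>
</div>
"""
cells = extract_cells(sample)
print(len(cells), cells)
```

Feeding the full block to `extract_cells` gives the count the module would be expected to reach, which can then be compared with the 12 items it actually returns.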
<h1 class="mainbox-title" style="text-align: left;">Батарея для ноутбука Toshiba PA3536 (Equium P200 Series, Satellite: L350 Series, L355 Series, L355D Series, P200 Series) 10.8V 4400mAh Black</h1>
Parsing parameters:
Text from which parsing starts: mainbox-title
Number of the column containing the product name, or parsing parameters: ">,</h1>
As a result of parsing I get a name like this: Батарея для ноутбука Toshiba PA3536 (Equium P200 Series, Satellite: L350 Series, L355 Series, L355D Series, P2
Please advise what the problem might be and whether there really are limits on the parsed fields.
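One quick diagnostic is to measure exactly where the truncated product name stops, both in characters and in UTF-8 bytes. A cut that lands on a round number would hint at a fixed-size limit somewhere in the pipeline. The sketch below only measures the string quoted above. It assumes the data is UTF-8 encoded, and the variable names are illustrative.

```python
# The truncated product name exactly as returned by the parser.
truncated = ("Батарея для ноутбука Toshiba PA3536 (Equium P200 Series, "
             "Satellite: L350 Series, L355 Series, L355D Series, P2")

chars = len(truncated)                       # length in Unicode characters
utf8_bytes = len(truncated.encode("utf-8"))  # Cyrillic letters take 2 bytes each
print(chars, utf8_bytes)
```

For this string the result is 110 characters and 128 UTF-8 bytes. That would be consistent with a byte-based limit of 128 on the name field, although this is only an inference from the output and not a confirmed property of the module.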