{"id":123,"date":"2018-01-20T01:06:19","date_gmt":"2018-01-19T22:06:19","guid":{"rendered":"http:\/\/talhacelik.com.tr\/?p=123"},"modified":"2018-04-27T01:58:18","modified_gmt":"2018-04-26T22:58:18","slug":"pythonda-mechanize-ve-beautifulsoup-modulleri","status":"publish","type":"post","link":"https:\/\/talhacelik.com.tr\/index.php\/2018\/01\/20\/pythonda-mechanize-ve-beautifulsoup-modulleri\/","title":{"rendered":"The Mechanize and BeautifulSoup Modules in Python"},"content":{"rendered":"<p>Before looking at the code of the example Python script, let&#8217;s briefly cover what the libraries it uses are for.<\/p>\n<p><strong><a href=\"https:\/\/github.com\/python-mechanize\/mechanize\" target=\"_blank\" rel=\"noopener\">The Mechanize Library<\/a><\/strong><\/p>\n<p>In short, the Mechanize library lets you create a very simple browser that has no graphical interface and is driven entirely from code. Internally, Mechanize builds on and extends the &#8220;urllib2&#8221; library. 
The conveniences and features Mechanize offers us as programmers include:<\/p>\n<ul>\n<li>Handling URL schemes<\/li>\n<li>Keeping cookies and sessions<\/li>\n<li>Processing robots.txt files<\/li>\n<li>Recognizing redirects<\/li>\n<li>Using proxies<\/li>\n<li>Opening FTP and HTTP connections<\/li>\n<li>&#8230;<\/li>\n<\/ul>\n<p>Mechanize itself is written in Python; much of its code was originally derived from Perl code by Gisle Aas, Johnny Lee, and Andy Lester.<br \/>\nOn Linux, we can install the library with the following command:<br \/>\n<code>$ pip install mechanize<\/code><br \/>\nYou can reach the library&#8217;s documentation and GitHub repository through the link in the heading.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-143 size-full\" src=\"http:\/\/talhacelik.com.tr\/wp-content\/uploads\/2018\/01\/10.1.jpg\" alt=\"\" width=\"250\" height=\"298\" \/><\/p>\n<p><strong><a href=\"https:\/\/www.crummy.com\/software\/BeautifulSoup\/bs4\/doc\/\" target=\"_blank\" rel=\"noopener\">The Beautiful Soup Library<\/a><\/strong><\/p>\n<p>The Beautiful Soup library is generally used to parse HTML and XML data, that is, to turn an XML document passed in as a parameter, for example, into something meaningful. 
Its biggest advantage is that even when the supplied HTML or XML contains malformed input, it can still parse and make sense of the data successfully.<br \/>\nBeautiful Soup is developed in Python, and its website features the quote &#8220;A tremendous boon&#8221; \ud83d\ude42 .<br \/>\nOn Linux, we can install the library with the following command:<br \/>\n<code>$ pip install beautifulsoup4<\/code><br \/>\nYou can reach the library&#8217;s documentation through the link in the heading.<\/p>\n<p>Now that we have covered the modules, let&#8217;s move on to our Python script.<\/p>\n<p><strong>What does this program do?<\/strong><br \/>\nThe program&#8217;s purpose is very simple. It is a script that works wherever there is an internet connection, regardless of operating system, runs from the console without a graphical interface, detects the language of the input text, and translates it into the requested language.<\/p>\n<p><strong>So how does it do that?<\/strong><br \/>\nEssentially, everything comes down to this line.<\/p>\n<p><code>_url = \"https:\/\/translate.google.com.tr\/m?hl=tr&amp;sl=auto&amp;tl={0}&amp;ie=UTF-8&amp;prev=_m&amp;q=\".format(self.to_language)<\/code><\/p>\n<p>Language detection is already handled automatically through the URL (<code>sl=auto<\/code>), so all that is left for us is to supply the target language. 
Next, the code<\/p>\n<p><code> html_codes = browser.open(_url).read()<br \/>\nsoup = BeautifulSoup(html_codes, 'html.parser')<br \/>\ntranslate_word = soup.find_all('div', attrs={'class': 't0'})<\/code><\/p>\n<p>first reads the page and then parses it, so that we are left with clean, meaningful data.<\/p>\n<p>Finally, it selects the <strong>div<\/strong> whose <strong>class<\/strong> name is <strong>t0<\/strong> from the parsed data, applies a few small string-cleaning operations (slicing off the surrounding div markup) to turn it into something the user can read, and returns it.<\/p>\n<p><code>  clear_text = str(translate_word[0])<br \/>\nclear_text = clear_text[26:]<br \/>\nclear_text = clear_text[:-6]<br \/>\nbrowser.close()<br \/>\nreturn clear_text<\/code><\/p>\n<p>In essence, the program opens a browser with mechanize and fetches the source of the target URL; the BeautifulSoup module then parses the incoming HTML with <strong>html.parser<\/strong> and stores the result in the <strong>soup<\/strong> variable. All that remains is to clean up the translated word and present it to the user.<\/p>\n<p>To see the project&#8217;s full code, just visit <a href=\"https:\/\/github.com\/tlhcelik\/term-translator\/blob\/master\/translate.py\" target=\"_blank\" rel=\"noopener\">github\/tlhcelik\/term-translator<\/a>.<\/p>\n<p>Thanks for reading \ud83d\ude42<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Before looking at the code of the example Python script, let&#8217;s briefly cover what the libraries it uses are for. 
The Mechanize Library In short, the Mechanize library lets you create a very simple browser that has no graphical interface and is driven entirely from code. Internally, Mechanize builds on and extends the &#8220;urllib2&#8221; library. The conveniences and features Mechanize offers us as programmers include &hellip;<\/p>\n","protected":false},"author":1,"featured_media":144,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12],"tags":[29,28,26,27],"_links":{"self":[{"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/posts\/123"}],"collection":[{"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/comments?post=123"}],"version-history":[{"count":22,"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/posts\/123\/revisions"}],"predecessor-version":[{"id":147,"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/posts\/123\/revisions\/147"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/media\/144"}],"wp:attachment":[{"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/media?parent=123"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/categories?post=123"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/talhacelik.com.tr\/index.php\/wp-json\/wp\/v2\/tags?pos
t=123"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}