
Dynamic ModelForm creation

That looks amazing!

from django import forms


def get_model_form_class(model_class, fields_list=None, exclude_list=None):
    # Build a ModelForm subclass whose Meta options come from the arguments
    class form_class(forms.ModelForm):
        class Meta:
            model = model_class
            fields = fields_list
            exclude = exclude_list
    return form_class

The idea is taken from: http://stackoverflow.com/questions/297383/dynamically-update-modelforms-meta-class/297478#297478
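The same closure trick works for any class that carries a nested options class. Here is a minimal sketch without Django, built on the built-in type(); the names make_class and Base are made up for illustration, and Base only stands in for forms.ModelForm:

```python
def make_class(base, **meta_options):
    """Build a subclass of `base` with a nested Meta holding `meta_options`."""
    # type(name, bases, namespace) creates a class at runtime
    meta = type('Meta', (object,), dict(meta_options))
    return type('Dynamic' + base.__name__, (base,), {'Meta': meta})


class Base(object):
    """Hypothetical stand-in for forms.ModelForm."""


form_class = make_class(Base, model='Article', fields=['title'], exclude=None)
print(form_class.__name__, form_class.Meta.fields)
```

For the plain Django case there is also modelform_factory in django.forms.models, which may be enough on its own.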

Comments: 45

Shortcut for printing user’s full name or username in template

That’s it:

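from django import template

# Library instance that the @register.filter decorator below relies on
register = template.Library()

@register.filter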
def nice_name(user):
    """
    Example::

        Hi, {{ user|nice_name }}
    """
    return user.get_full_name() or user.username
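
To see the fallback outside a template, here is a small sketch with a stand-in user class. FakeUser is hypothetical and only imitates the two attributes the filter touches; it is not Django's User model:

```python
class FakeUser(object):
    """Hypothetical stand-in with the two attributes the filter uses."""

    def __init__(self, username, first_name='', last_name=''):
        self.username = username
        self.first_name = first_name
        self.last_name = last_name

    def get_full_name(self):
        # Empty string when both name parts are missing
        return ('%s %s' % (self.first_name, self.last_name)).strip()


def nice_name(user):
    # An empty full name is falsy, so `or` falls back to the username
    return user.get_full_name() or user.username


print(nice_name(FakeUser('jdoe', 'John', 'Doe')))
print(nice_name(FakeUser('anon42')))
```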

How to extract html page title by URL

Actually the subject can be divided into two tasks:

  • retrieve data
  • extract information from it

Python’s standard library has urllib2 for retrieving data over HTTP, and there are a number of libraries for parsing HTML. I’ll use html5lib in this example.

First iteration of retrieving data

import urllib2

def read_url(url):
    try:
        response = urllib2.urlopen(url)
    except urllib2.URLError:
        return u''
    encoding = get_charset(response.headers)
    data = response.read()  # raw bytes of the response body
    return unicode(data, encoding)

We need an extra utility function, get_charset:

def get_charset(headers, default='utf-8'):
    try:
        content_type = headers['content-type'].lower()
        if content_type.find('charset=') > 0:
            return content_type.split('charset=')[-1].lower()
    except KeyError:
        pass
    return default
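
The split on 'charset=' keeps anything that follows the charset, such as quotes or further parameters. As a possible alternative, here is a sketch for Python 3 that lets the standard email.message.Message class parse the header. The function name charset_from_content_type is made up for this example:

```python
from email.message import Message


def charset_from_content_type(value, default='utf-8'):
    """Return the charset parameter of a Content-Type header value."""
    if not value:
        return default
    msg = Message()
    msg['Content-Type'] = value
    # get_content_charset() lowercases the result and strips quotes
    return msg.get_content_charset(default)


print(charset_from_content_type('text/html; charset="ISO-8859-1"; foo=bar'))
print(charset_from_content_type('text/html'))
```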

Now we can get data!

>>> d = read_url('http://python.org')
>>> d[:50]
u'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Trans'

Seems like that’s what we need.

Extracting title with html5lib

There are examples for it: http://www.sal.ksu.edu/faculty…

Here’s an extractor function based on those examples:

from html5lib import HTMLParser, treebuilders, treewalkers

parser = HTMLParser(tree=treebuilders.getTreeBuilder("dom"))
walker = treewalkers.getTreeWalker("dom")

def extract_title(html):
    domtree = parser.parse(html)
    titleNode = False
    title = u''
    for token in walker(domtree):
        if token['type'] == 'StartTag' and token['name'] == 'title':
            titleNode = True
        elif titleNode:
            if token['type'] == 'EndTag' and token['name'] == 'title':
                break
            elif 'data' in token:
                title += token['data']
    return title.strip()

Let’s try!

>>> extract_title(d)
u'Python Programming Language -- Official Website'

Amazing! That’s working!
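
If html5lib is not at hand, roughly the same thing can be sketched with the parser from the standard library. This is a different technique from the tree walker above, written for Python 3 (html.parser); TitleParser and extract_title_stdlib are names made up for this example:

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the first <title> element."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.in_title = False
        self.finished = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased
        if tag == 'title' and not self.finished:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == 'title' and self.in_title:
            self.in_title = False
            self.finished = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title_stdlib(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts).strip()


print(extract_title_stdlib('<html><head><TITLE> Hello </TITLE></head></html>'))
```

Unlike html5lib, this parser does not try to repair broken markup the way browsers do, so treat it as a fallback rather than a replacement.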

Optimization, possibly

The one drawback of the extraction method above is that the whole page has to be downloaded and parsed just to get the title. I’ve tried to optimize it by reading HTTP data only until the title has been read.

Here’s the read_url function revisited. It’s designed to read data in chunks until a specified string is found.

import re
import urllib2

def read_url(url, until=None, chunk=100):
    try:
        response = urllib2.urlopen(url)
    except urllib2.URLError:
        return u''

    encoding = get_charset(response.headers)

    if until:
        # escape the marker so it is matched literally, ignoring case
        pattern = re.compile(re.escape(until), re.IGNORECASE)
        data = ''
        while True:
            part = response.read(chunk)
            if not part:
                break  # end of stream, marker never found
            data += part
            match = pattern.search(data)
            if match:
                response.close()
                # cut right after the marker, whatever its letter case
                return unicode(data[:match.end()], encoding)
    else:
        data = response.read()
    return unicode(data, encoding)

So, we can now read until </title>!

>>> d = read_url('http://python.org/', until='</title>')
>>> d
u'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n  <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n  <title>Python Programming Language -- Official Website</title>'
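
The chunked reading itself can be tried without any network. Here is a sketch for Python 3 that reads from any file-like object; read_until is a made-up name, and io.BytesIO stands in for the HTTP response:

```python
import io
import re


def read_until(stream, until, chunk=100, encoding='utf-8'):
    """Read `stream` in chunks and stop right after `until` (case-insensitive)."""
    pattern = re.compile(re.escape(until.encode(encoding)), re.IGNORECASE)
    data = b''
    while True:
        part = stream.read(chunk)
        if not part:
            break  # end of stream, marker never found
        data += part
        # Searching the accumulated data also catches a marker
        # that is split across two chunks
        match = pattern.search(data)
        if match:
            data = data[:match.end()]
            break
    return data.decode(encoding)


page = b'<html><head><TITLE>Hi</TITLE></head><body>' + b'x' * 1000 + b'</body>'
stream = io.BytesIO(page)
print(read_until(stream, '</title>'))
print(stream.tell())  # how many bytes were actually consumed
```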

Let’s test performance. A very basic test looks like this:

def test():
    from time import time

    t1 = time()
    d1 = read_url('http://python.org/', until='</title>')
    t2 = time()

    t3 = time()
    d2 = read_url('http://python.org/')
    t4 = time()

    print t2-t1
    print t4-t3

Results:

>>> test()
0.131000041962
0.31500005722
>>> test()
0.12700009346
0.318000078201
>>> test()
0.125999927521
0.31299996376

The optimized reader is considerably faster: roughly 0.13 s against 0.31 s in the runs above.

That’s it

Other HTML parsing libraries are mentioned here.


Unicode ‘funny characters’

There are characters that sometimes cause strange behaviour when you try to print them to the console.

It seems to depend on the environment and on how Python was built. I tested it on Windows Vista and it failed; then it worked on some *nix machines and failed on others. I used Python 2.6.x. So it’s possible you won’t be able to reproduce it!

An example

Russian character ‘ы’ has code U+044B, symbol ‘©’ has code U+00A9.

    >>> a = u'ы'
    >>> a
    u'\u044b'
    >>> b = u'\u00a9'
    >>> b
    u'\xa9'

Trying to print:

    >>> print a
    ы
    >>> print b
    ...
    UnicodeEncodeError: 'charmap' codec can't encode character u'\xa9' in position 0: character maps to <undefined>
    >>>

Try to write to file:

    >>> f = open('test.txt', 'w')
    >>> f.write((a + b).encode('utf-8'))
    >>> f.close()

That’s working ok!

What to do

Steer clear of them.

I’ve found a suggestion to remove such characters from the text: http://code.activestate.com/recipes/546517-accent2htmlcodepy-convert-accents-and-special-char/

That may not be the best solution, but it may be necessary if you want to print text with ‘funny characters’ to the console.

_spec_chars = [u'\xc1',u'\xe1',u'\xc0',u'\xc2',u'\xe0',u'\xc2',u'\xe2',u'\xc4',u'\xe4',u'\xc3',u'\xe3',u'\xc5',u'\xe5',u'\xc6',u'\xe6',u'\xc7',u'\xe7',u'\xd0',u'\xf0',u'\xc9',u'\xe9',u'\xc8',u'\xe8',u'\xca',u'\xea',u'\xcb',u'\xeb',u'\xcd',u'\xed',u'\xcc',u'\xec',u'\xce',u'\xee',u'\xcf',u'\xef',u'\xd1',u'\xf1',u'\xd3',u'\xf3',u'\xd2',u'\xf2',u'\xd4',u'\xf4',u'\xd6',u'\xf6',u'\xd5',u'\xf5',u'\xd8',u'\xf8',u'\xdf',u'\xde',u'\xfe',u'\xda',u'\xfa',u'\xd9',u'\xf9',u'\xdb',u'\xfb',u'\xdc',u'\xfc',u'\xdd',u'\xfd',u'\xff',u'\xa9',u'\xae',u'\u2122',u'\u20ac',u'\xa2',u'\xa3',u'\u2018',u'\u2019',u'\u201c',u'\u201d',u'\xab',u'\xbb',u'\u2014',u'\u2013',u'\xb0',u'\xb1',u'\xbc',u'\xbd',u'\xbe',u'\xd7',u'\xf7',u'\u03b1',u'\u03b2',u'\u221e']

def cleanspec(s, cleaned=_spec_chars):
    # Replace every listed character with a space, keep the rest
    return u''.join([u' ' if c in cleaned else c for c in s])

Try printing the cleaned text:

    >>> print cleanspec(b + a)
     ы

That is a workaround. Maybe this is an issue with Python’s print; I’m not quite sure about that.
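
Another option, instead of a hand-maintained list, is to let the codec replace whatever the console encoding cannot represent. Here is a sketch for Python 3; console_safe is a made-up name, and cp866 in the usage line is only my guess at a console code page that has Cyrillic letters but no ‘©’:

```python
import sys


def console_safe(text, encoding=None):
    """Replace characters that `encoding` cannot represent with '?'."""
    # Fall back to ASCII when the console encoding is unknown
    encoding = encoding or sys.stdout.encoding or 'ascii'
    return text.encode(encoding, errors='replace').decode(encoding)


print(console_safe(u'\u00a9\u044b', 'cp866'))
```

This keeps every character the console can show and only touches the ones that would raise UnicodeEncodeError.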

That’s it :)

Tags: python unicode  

Next order value for Django model instance

Assume we have a Django model with a special integer field, weight, used for ordering. We may want to assign its value automatically on save. Here’s a snippet implementing that behaviour:

from django.db import models

class MyModel(models.Model):
    # some fields...
    weight = models.IntegerField(default=0)

    class Meta:
        ordering = ('weight',)

    def save(self, *args, **kwargs):
        self.weight = get_next_value(self, field_name='weight')
        super(MyModel, self).save(*args, **kwargs)

Here’s the get_next_value implementation:

def get_next_value(instance, field_name='order', step=10, **filter):
    model = instance.__class__
    qs = model.objects.order_by('-%s' % field_name)
    if filter:
        qs = qs.filter(**filter)
    try:
        max_value = getattr(qs[:1][0], field_name, 0)
    except IndexError:
        max_value = 0
    return (max_value // step + 1) * step

The implementation above can handle any field name with a specified step. It can also provide the next field value for a filtered queryset.

The snippet is part of the halfbit-web-helpers collection.
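
The arithmetic in the last line rounds the current maximum down to a multiple of step and then adds one step. Here it is on its own, without Django, with a made-up function name:

```python
def next_step_value(max_value, step=10):
    """Next multiple of `step` strictly greater than `max_value`."""
    # Integer division drops the remainder: 15 // 10 == 1
    return (max_value // step + 1) * step


for current in (0, 10, 15):
    print(current, next_step_value(current))
```

So an empty table starts at 10, and a maximum of 15 (for example, after manual reordering) is followed by 20 rather than 25.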


Smarter Django cart

Introduction

django-carting is a basic online store application for Django. It is deliberately kept sketchy, so that it can be treated as a rewritable application.

Demo is available:
http://carting-demo.05bit.com

Concept

It differs conceptually from both the Satchmo and LFS projects. Basically, it’s just a “cart application” with utilities, which can be used to build a full-featured online store.

The two points are:

  • products catalog is always too custom to be customized
  • design and html layouts are usually totally new for commercial project

So those parts have to be rewritten from scratch each time.

Feature

The cart is smartly bound to a user or a session: it is bound to the authenticated user, or stored in the session for an anonymous user. So the cart is remembered for an authenticated user and does not disappear when the session expires.

Projects powered by


Smarter Django project configuration

Two points

  1. Generally we need different settings for a Django project in a development environment on localhost and in a production environment. Settings may also differ across development environments when the application is developed by many programmers.

  2. Project source is best kept in a VCS repository, so settings should be stored there as well. Settings changes that are common to all environments should be applied automatically after updating from the repository.

How to

Here are a couple of recipes for configuring a Django project in a slightly smarter way.

1. Define settings template

Just copy settings.py to settings_template.py and then import everything from settings_template in settings.py:

from settings_template import *

Then you may override some settings, e.g. the DEBUG flag or the database settings:

from settings_template import *

DEBUG = False

Note: now settings.py should not be stored in repository, but settings_template.py should be stored there.

2. Define paths based on project directory

I know many good programmers do that :) Add this somewhere near the top of settings_template.py:

import os

PROJECT_ROOT = os.path.realpath(os.path.dirname(__file__))

Now you can define media and template paths like this:

MEDIA_ROOT = os.path.join(PROJECT_ROOT, 'media')

MEDIA_URL = '/media/'

ADMIN_MEDIA_PREFIX = '/media/admin/'

TEMPLATE_DIRS = (
    os.path.join(PROJECT_ROOT, 'templates'),
)
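
If there are many such paths, a tiny helper saves some typing. This is only a sketch: make_rel is a made-up name, and the /srv/project root in the usage lines is an arbitrary example (in settings_template.py you would pass os.path.dirname(__file__) instead):

```python
import os


def make_rel(root):
    """Return a function that joins path parts onto the resolved `root`."""
    root = os.path.realpath(root)

    def rel(*parts):
        return os.path.join(root, *parts)

    return rel


rel = make_rel(os.path.join(os.sep, 'srv', 'project'))
MEDIA_ROOT = rel('media')
TEMPLATE_DIRS = (rel('templates'),)
print(MEDIA_ROOT)
```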

3. Configure serving static files

The Django documentation suggests serving or not serving static files depending on the DEBUG flag. Sometimes that is not right, as we may need debug mode in a production environment, where static files are served by the web server.

I suggest using a separate SERVE_STATIC flag in settings for this. So, here is an example urls.py:

from django.conf import settings

# some urls config ...

if getattr(settings, 'SERVE_STATIC', False):
    urlpatterns += patterns('',
        (r'^' + settings.MEDIA_URL[1:] + r'(?P<path>.*)$',
         'django.views.static.serve',
         {'document_root': settings.MEDIA_ROOT})
    )

So, when I need to serve static files with the development server, I just add SERVE_STATIC = True to settings.py.
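
To check what the URL pattern above actually matches, the regular expression can be tried on its own. This sketch assumes, as the [1:] slice suggests, that the request path is matched without its leading slash. It also adds re.escape, which the original does not use, and static_path is a made-up helper name:

```python
import re

MEDIA_URL = '/media/'

# Same construction as in urls.py, with the prefix escaped for safety
static_re = re.compile(r'^' + re.escape(MEDIA_URL[1:]) + r'(?P<path>.*)$')


def static_path(url_path):
    """Return the file path captured by the pattern, or None if no match."""
    match = static_re.match(url_path)
    return match.group('path') if match else None


print(static_path('media/css/site.css'))
print(static_path('admin/'))
```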

Feedback

Please leave feedback on whether this was useful or not. I hope it was :) If you have your own recipes on the subject, you’re welcome to share links or snippets.


Python logging example and helper

When I tried to find a simple example of how to use the Python logging module to write logs to a file, I became frustrated, as there are only a few useful examples. Here’s one that really works and covers most use cases: see it on mechanicalcat.net.

And here’s a simple helper to open a log file, based on that example:

import os
import logging

def openlog(filename, logger_name=None, level=logging.DEBUG, format='%(asctime)s %(levelname)s %(message)s'):
    if not logger_name:
        # default to the file's base name, e.g. 'just_a_test.log'
        logger_name = os.path.basename(filename)
    logger = logging.getLogger(logger_name)
    handler = logging.FileHandler(filename)
    formatter = logging.Formatter(format)
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger

So, you can simply use this helper like this:

logger = openlog('/var/log/just_a_test.log')
logger.debug("hello, i'm debug message!")

Actually, basic examples are provided in the docs. I missed them the first time, because the reference looks complicated at first glance.
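
One thing to watch with the helper: logging.getLogger returns the same logger object for the same name, so calling the helper twice attaches two handlers and every message gets written twice. Here is a possible guard, as a sketch; openlog_once is a made-up name, and the log file goes to a temporary directory just for the demo:

```python
import logging
import os
import tempfile


def openlog_once(filename, logger_name=None, level=logging.DEBUG,
                 fmt='%(asctime)s %(levelname)s %(message)s'):
    """Like openlog, but attaches a file handler only on the first call."""
    logger = logging.getLogger(logger_name or os.path.basename(filename))
    if not logger.handlers:
        handler = logging.FileHandler(filename)
        handler.setFormatter(logging.Formatter(fmt))
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), 'just_a_test.log')
logger = openlog_once(log_path)
logger = openlog_once(log_path)  # second call reuses the existing handler
logger.debug("hello, i'm debug message!")
```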


Flavoured Markdown

Markdown is good, but it has two features that are hard to explain to an inexperienced user:

  • it doesn’t convert urls to links automatically
  • it doesn’t convert line breaks to html <br> tags

It seems that in 80% of cases it would be better if this were done. Python Markdown supports extensions and thus can be “flavoured”.
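
To make the two behaviours concrete, here is a rough sketch of them as a plain pre-processing step on the Markdown source. This is not a Python Markdown extension, just an illustration: flavour and URL_RE are made-up names, and the URL pattern is deliberately naive (it will swallow trailing punctuation, for instance):

```python
import re

# A bare http(s) URL that is not already inside <...>, (...) or [...]
URL_RE = re.compile(r'(?<![<(\[])\bhttps?://[^\s<>]+')


def flavour(text):
    """Wrap bare URLs in <...> and turn single newlines into hard breaks."""
    # <url> is Markdown's own automatic link syntax
    text = URL_RE.sub(lambda m: '<%s>' % m.group(0), text)
    # Two trailing spaces before a newline produce a <br> in Markdown;
    # blank lines (paragraph breaks) are left alone
    return re.sub(r'(?<!\n)\n(?!\n)', '  \n', text)


print(repr(flavour('see http://example.com\nnext line')))
```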

So, there are extensions:
