Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

MarkusH
Copy link
Member

@MarkusH MarkusH commented Feb 12, 2017

This is a first shot at it would or could look like. There are probably dozens more things to look into and consider.

$ cat app/models.py
from django.db import models
from django.db.models.expressions import Ref, Value
from django.db.models.functions import Concat, Lower


class Foo(models.Model):
    name = models.CharField(max_length=255)

    class Meta:
        indexes = [
            models.FuncIndex(
                Concat(
                    Lower(Ref('name', None), output_field=models.CharField()),
                    Value('blub'),
                    output_field=models.CharField(),
                ),
                name='some_func_index'
            ),
        ]
$ python manage.py makemigrations -v 3
Migrations for 'app':
  app/migrations/0001_initial.py
    - Create model Foo
    - Create index some_func_index for model foo
$ cat app/migrations/0001_initial.py
# Generated by Django 2.0.dev20170212181215 on 2017-02-12 18:14

from django.db import migrations, models
import django.db.models.expressions
import django.db.models.functions.base


class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='Foo',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('name', models.CharField(max_length=255)),
            ],
        ),
        migrations.AddIndex(
            model_name='foo',
            index=models.FuncIndex(expression=django.db.models.functions.base.Concat(django.db.models.functions.base.Lower(django.db.models.expressions.Ref('name', None), output_field=models.CharField()), django.db.models.expressions.Value('blub'), output_field=models.CharField()), name='some_func_index'),
        ),
    ]
$ python manage.py migrate app
Operations to perform:
  Apply all migrations: app
Running migrations:
  Applying app.0001_initial... OK
$ sqlite3 db.sqlite3
SQLite version 3.16.2 2017-01-06 16:32:41
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE IF NOT EXISTS "django_migrations" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "app" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "applied" datetime NOT NULL);
CREATE TABLE IF NOT EXISTS "app_foo" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "name" varchar(255) NOT NULL);
CREATE INDEX "some_func_index" ON "app_foo" (COALESCE(LOWER("name"), '') || COALESCE('blub', ''));
sqlite> .quit

@charettes
Copy link
Member

Out of curiosity, what's the reason the example index can't be defined as

Concat(Lower(F('name')), Value('blub'))

@MarkusH
Copy link
Member Author

MarkusH commented Feb 13, 2017

@charettes Apparently, I haven't fully understood it, F() looks up fields in the Compiler and refers to already mentioned fields. However, the Compiler used here doesn't have any mentioned fields, thus using F() fails. But that's something I want to get to: Concat(Lower('name'), Value('blub'))

@jarshwah
Copy link
Member

It's been a little while since I looked into this, but you can't use F() because:

  1. F doesn't define an as_sql() method because;
  2. F must be resolved by the query, which turns it into either a Col or a Ref (I suspect a Col would work as well), which do have as_sql() methods
  3. To resolve an F, you need to hand it a query (+ joins), and the query is used to setup joins if needed.

You might be able to work around this by passing F.resolve_expression a fake query object with a resolve_ref method that simply verifies the column existing on the model, and returning the appropriate Col or Ref. Col looks like it outputs "table.column" while Ref outputs "column_alias" which can just be the column name.

You could try building non-query support into the F expression by overloading the resolve_expression method, but that kind of defeats the purpose of using F anyway, while complicating the code. F doesn't do the column/reference verification, that's handled by query.resolve_ref.

Hope that helps. Also, nice work! I really look forward to seeing this available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be on F. I fixed this in my rebase of #7517.

@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch from 9fe867d to 4d430fb Compare March 3, 2017 22:50
@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch from 8eb1989 to b2a6e82 Compare April 5, 2017 10:49
@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch 2 times, most recently from f473af1 to 0d77c47 Compare May 7, 2017 13:35
@MarkusH
Copy link
Member Author

MarkusH commented May 7, 2017

This is a second shot at how function based indexes could look like.

$ cat app/models.py
from django.db import models
from django.db.models.functions import Lower


class Foo(models.Model):
    name = models.CharField(max_length=255)

    class Meta:
        indexes = [
            models.Index(
                [Lower('name').desc()],
                name='some_func_index'
            ),
        ]
$ python manage.py makemigrations -v 3
Migrations for 'app':
  app/migrations/0001_initial.py
    - Create model Foo
    - Create index some_func_index for model foo
$ cat app/migrations/0001_initial.py
# Generated by Django 2.0.dev20170507133517 on 2017-05-07 13:52

from django.db import migrations, models
import django.db.models.expressions
import django.db.models.functions.base


class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='Foo',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('name', models.CharField(max_length=255)),
            ],
        ),
        migrations.AddIndex(
            model_name='foo',
            index=models.Index(fields=[django.db.models.expressions.OrderBy(django.db.models.functions.base.Lower('name'), descending=True)], name='some_func_index'),
        ),
    ]

On SQLite3

$ python manage.py sqlmigrate app 0001
BEGIN;
--
-- Create model Foo
--
CREATE TABLE "app_foo" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "name" varchar(255) NOT NULL);
--
-- Create index some_func_index on field(s) OrderBy(Lower(F(name)), descending=True) of model foo
--
CREATE INDEX "some_func_index" ON "app_foo" (LOWER("name") DESC);
COMMIT;
$ python manage.py migrate app
Operations to perform:
  Apply all migrations: app
Running migrations:
  Applying app.0001_initial... OK
$ python manage.py dbshell
SQLite version 3.16.2 2017-01-06 16:32:41
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE IF NOT EXISTS "django_migrations" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "app" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "applied" datetime NOT NULL);
CREATE TABLE IF NOT EXISTS "app_foo" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "name" varchar(255) NOT NULL);
CREATE INDEX "some_func_index" ON "app_foo" (LOWER("name") DESC);
sqlite> .quit

On PostgreSQL

$ python manage.py sqlmigrate app 0001 --settings=testproject.pg_settings
BEGIN;
--
-- Create model Foo
--
CREATE TABLE "app_foo" ("id" serial NOT NULL PRIMARY KEY, "name" varchar(255) NOT NULL);
--
-- Create index some_func_index on field(s) OrderBy(Lower(F(name)), descending=True) of model foo
--
CREATE INDEX "some_func_index" ON "app_foo" (LOWER("name") DESC);
COMMIT;
$ python manage.py migrate app --settings=testproject.pg_settings
Operations to perform:
  Apply all migrations: app
Running migrations:
  Applying app.0001_initial... OK
$ python manage.py dbshell --settings=testproject.pg_settings
psql (9.6.2)
Type "help" for help.

django=> \dS+ app1_foo
                                                     Table "public.app1_foo"
 Column |          Type          |                       Modifiers                       | Storage  | Stats target | Description 
--------+------------------------+-------------------------------------------------------+----------+--------------+-------------
 id     | integer                | not null default nextval('app1_foo_id_seq'::regclass) | plain    |              | 
 name   | character varying(255) | not null                                              | extended |              | 
Indexes:
    "app1_foo_pkey" PRIMARY KEY, btree (id)
    "some_func_index" btree (lower(name::text) DESC)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the index is named title_lower_id this LOWER check could check against that as well as lower(xxx) -- please rename the index so there is only one match for LOWER (same for TITLE)

@felixxm
Copy link
Member

felixxm commented May 8, 2017

I found this error:

======================================================================
ERROR: test_func_index (schema.tests.SchemaTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/django/tests/schema/tests.py", line 1771, in test_func_index
    sql = func_index.create_sql(Book, editor)
  File "/django/django/db/models/indexes.py", line 78, in create_sql
    sql_parameters = self.get_sql_create_template_values(model, schema_editor, using)
  File "/django/django/db/models/indexes.py", line 58, in get_sql_create_template_values
    expression = column_expression.resolve_expression(query)
  File "/django/django/db/models/functions/datetime.py", line 60, in resolve_expression
    field = copy.lhs.output_field
  File "/django/django/utils/functional.py", line 31, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/django/django/db/models/expressions.py", line 226, in output_field
    raise FieldError("Cannot resolve expression type, unknown output_field")
django.core.exceptions.FieldError: Cannot resolve expression type, unknown output_field

when I had added index with ExtractMonth to the test_func_index:

func_index = Index(fields=[ExtractMonth('pub_date')], name='pub_data_month_idx')
with connection.schema_editor() as editor:
    sql = func_index.create_sql(Book, editor)

@MarkusH
Copy link
Member Author

MarkusH commented May 8, 2017

Thanks @apollo13 @felixxm . I think I took care of both your comments. The tests do need some major improvements.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.assertTrue -> self.assertIn above and below as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️ Thanks!

@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch 2 times, most recently from 9f1972f to 4262296 Compare May 8, 2017 10:13
@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch from 4262296 to a593989 Compare July 21, 2017 02:17
@MarkusH MarkusH changed the title [WIP] Fixed #26167 -- Added support for functional indexes Fixed #26167 -- Added support for functional indexes Jul 21, 2017
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you can remove "%s" %, i.e.

return connection.ops.quote_name(self.alias), []

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's also worth to add test and mention in doc that it's possible to create an index with math expressions, e.g.:

Index(fields=[F('height')*F('weight')], name='mul_idx')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a similar restriction on Oracle (see doc). Function must be declared as DETERMINISTIC.

@MarkusH
Copy link
Member Author

MarkusH commented Jul 23, 2017

Thanks @felixxm. Updated.

@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch 5 times, most recently from daf02f7 to 292c455 Compare July 25, 2017 05:07
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertRaisesRegex should be avoided, I believe, cc @timgraham

I've seen a pattern in the migration tests where self.assertIn is used to check for parts of the message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertRaisesMessage uses assertIn so it works just as well without the need for the .*.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use current instead of 9.6

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inherit from django.db.NotSupportedError instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch from 43c87b7 to 0cd5fb9 Compare July 27, 2017 00:51
Copy link
Member

@jarshwah jarshwah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only done cursory code review, but it's looking nice! I haven't actually executed any of the code yet.

I'd encourage you to write tests as if you were trying to break Django. What happens if two indexes are created with the same name on the same model? Same name on different models? Should we namespace them perhaps? etc

I think this will be good to merge before cutoff. It's in really good shape, and it sounds like others have tested it quite well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Is this an error or not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's an error on platforms w/o support for expression indexes to create one, the presence of an index is often not required. Thus, I'm raising warning instead of blowing up. That's the only way I can imagine for 3rd party packages using expression indexes to use them while supporting all databases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm alias generally refers to the table name - not the column name. You're changing the meaning of alias depending on the index_col.

Maybe just pass None, 'name', index_col=True ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we do something similar in order_by? Consider refactoring into a helper?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the occurrences of similar things in the order_by() and OrderBy() context (django.db.models.sql.query:add_ordering and django.db.models.sql.compiler:get_order_by) are based on using ASC / DESC over .asc() / .desc(). Further, the compiler related functions lookup table aliases and whatnot which we don't have in the index context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lastly, the code doing ordering is already around. It's only updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah no need to worry about this. I didn't look up the other function, it was just from memory. Hard to refactor into common method so forget it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should fix the args into here as mentioned above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is cute <

@MarkusH MarkusH force-pushed the ticket26167-func-indexes branch from ff78154 to b4f6cbf Compare August 8, 2017 01:55
Copy link
Contributor

@atombrella atombrella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small style comment

@skipUnlessDBFeature('supports_expression_indexes')
def test_func_index_invalid_field(self):
func_index = Index(fields=[Lower('blub')], name='dolor_idx')
with self.assertRaisesMessage(ValueError, "Invalid reference to field 'blub' on model 'schema.Author'"):

This comment was marked as outdated.

@atombrella
Copy link
Contributor

atombrella commented Sep 8, 2017

@MarkusH I had a look in get_constraints and in PostgreSQL's backend the expression definition is part of this. Would it be an idea to include tests for this? The SQL for the index is only included in cases like this where you use an expression. However, a cast may be added to the SQL, e.g., lower(column) is lower(column::text) if column is not of type text. Oracle probably has something similar to pg_get_indexdef. SQLite doesn't seem like it does.

Please do rebase.

@bluetech
Copy link
Contributor

bluetech commented Oct 12, 2017

Would be nice to have a test case using a subclass of Index, just to ensure it works. Example (IIUC):

from django.db import models, Func, F
from django.contrib.postgres.indexes import GistIndex


class Location(models.Model):
    lat = models.FloatField()
    lon = models.FloatField()

    class Meta:
        indexes = [
            GistIndex(fields=[Func(F('lat'), F('lon'), function='ll_to_earth')]),
        ]

should result in

CREATE INDEX idx_name ON app_location USING gist(ll_to_earth("lat", "lng"));

if not any(f.name == name for f in self.model._meta.concrete_fields):
raise ValueError("Invalid reference to field '%s' on model '%s'" % (name, self.model._meta.label))
source_field = self.model._meta.get_field(name)
return Col(None, source_field, index_col=True) # None refers to the table alias we don't have here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts about using a Col subclass here instead of this flag? You'd simply have to return IndexCol(None, source_field).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a very similar requirement in my work on Check Constraints: 1a5ade0#diff-fe637a3ec6af9cbe2ee87293000e7c82R751

It's probably worthwhile checking if we can share some implementation details here.

contains_column_references = True

def __init__(self, alias, target, output_field=None):
def __init__(self, alias, target, output_field=None, index_col=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should create a Col subclass for this purpose instead.

e.g.

class IndexCol(Col):
    def as_sql(self, compiler, connection):
        return connection.ops.quote_name(self.target.column), []

Copy link

@gilmarsoares-luizalabs gilmarsoares-luizalabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Copy link

@ariadyne-luizalabs ariadyne-luizalabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

@schinckel
Copy link
Contributor

I've been working on a CalculatedField, which has a significant overlap with this (in the sense of that the only part of my field that does not work is when you try to create an index on it).

https://github.com/schinckel/django-computed-field

Ignoring the fact I have to currently sniff up the stack to resolve the query (I'm thinking about trying to use Ref instead of F though, based on reading the stuff here), I wonder if this is a solution to this problem that could be considered. You could (hopefully) just use the computed field by name in the index stuff, and then it would handle all of the expression handling.

@spumer
Copy link

spumer commented Sep 2, 2019

Any updates?

@LuRsT
Copy link
Contributor

LuRsT commented Sep 17, 2019

@MarkusH this would be great to have in, any plans on getting back to it?

@felixxm
Copy link
Member

felixxm commented Oct 17, 2019

Superseded by #11929

@felixxm felixxm closed this Oct 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.