Content Models


This is where we expose a small handful of models and model mixins that we want callers to extend or make foreign keys to. Callers importing this module should never instantiate any of the models themselves–there are API functions in authoring.py to create and modify data models in a way that keeps those models consistent.

class openedx_content.models_api.Collection(*args, **kwargs)#

Bases: Model

Represents a collection of library components

Parameters:
  • id (AutoField) – Primary key: Id

  • collection_code (MultiCollationCharField) – Collection code

  • title (MultiCollationCharField) – Title. The title of the collection.

  • description (MultiCollationTextField) – Description. Provides extra information for the user about this collection.

  • enabled (BooleanField) – Enabled. Disabled collections are “soft deleted”, and should be re-enabled before use, or be deleted.

  • created (DateTimeField) – Created

  • modified (DateTimeField) – Modified

Relationship fields:

Parameters:

Reverse relationships:

Parameters:

collectionpublishableentity (Reverse ForeignKey from CollectionPublishableEntity) – All collection publishable entities of this collection (related name of collection)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.CollectionPublishableEntity(*args, **kwargs)#

Bases: Model

Collection -> PublishableEntity association.

Parameters:

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.Component(*args, **kwargs)#

Bases: PublishableEntityMixin

This represents any Component that has ever existed in a LearningPackage.

What is a Component#

A Component is an entity like a Problem or Video. It has enough information to identify itself and determine what the handler should be (e.g. XBlock Problem), but little beyond that.

A Component will have many ComponentVersions over time, and most metadata is associated with the ComponentVersion model and the Media that ComponentVersions are associated with.

A Component belongs to exactly one LearningPackage.

A Component is 1:1 with PublishableEntity and has matching primary key values. More specifically, Component.pk maps to Component.publishable_entity_id, and any place where the Publishing API module expects to get a PublishableEntity.id, you can use a Component.pk instead.

Identifiers#

Components have a publishable_entity OneToOneField to the publishing app’s PublishableEntity field, and it uses this as its primary key. Please see PublishableEntity’s docstring for how you should use its uuid and key fields.

State Consistency#

The key field on Component’s publishable_entity is derived from the component_type and component_code fields in this model. We don’t support changing the keys yet, but if we do, those values need to be kept in sync.

How to build on this model#

Make a foreign key to the Component model when you need a stable reference that will exist for as long as the LearningPackage itself exists.

Parameters:

component_code (MultiCollationCharField) – Component code

Relationship fields:

Parameters:
  • publishable_entity (OneToOneField to PublishableEntity) – Primary key: Publishable entity (related name: component)

  • learning_package (ForeignKey to LearningPackage) – Learning package (related name: component)

  • component_type (ForeignKey to ComponentType) – Component type (related name: component)

Reverse relationships:

Parameters:

versions (Reverse ForeignKey from ComponentVersion) – All versions of this Component (related name of component)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

property pk#

Mark the .pk attribute as deprecated

class openedx_content.models_api.ComponentType(*args, **kwargs)#

Bases: Model

Normalized representation of a type of Component.

The only namespace being used initially will be ‘xblock.v1’, but we will probably add a few others over time, such as a component type to represent packages of files for things like Files and Uploads or python_lib.zip files.

Make a ForeignKey against this table if you have to set policy based on the type of Components–e.g. marking certain types of XBlocks as approved vs. experimental for use in libraries.

Parameters:
  • id (AutoField) – Primary key: Id

  • namespace (MultiCollationCharField) – Namespace

  • name (MultiCollationCharField) – Name

Reverse relationships:

Parameters:

component (Reverse ForeignKey from Component) – All Components of this component type (related name of component_type)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.ComponentVersion(*args, **kwargs)#

Bases: PublishableEntityVersionMixin

A particular version of a Component.

This holds the media using a M:M relationship with Media via ComponentVersionMedia.

Relationship fields:

Parameters:

Reverse relationships:

Parameters:

componentversionmedia (Reverse ForeignKey from ComponentVersionMedia) – All component version medias of this Component Version (related name of component_version)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.ComponentVersionMedia(*args, **kwargs)#

Bases: Model

Determines the Media for a given ComponentVersion.

A ComponentVersion may be associated with multiple pieces of binary data. For instance, a Video ComponentVersion might be associated with multiple transcripts in different languages.

When Media is associated with a ComponentVersion, it has a path that is unique within the context of that ComponentVersion. This is used as a local file-path-like identifier, e.g. “static/image.png”.

Media is immutable and sharable across multiple ComponentVersions.

Parameters:
  • id (BigAutoField) – Primary key: ID

  • path (MultiCollationCharField) – Path

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.Container(*args, **kwargs)#

Bases: PublishableEntityMixin

A Container is a type of PublishableEntity that holds other PublishableEntities. For example, a “Unit” Container might hold several Components.

For now, all containers have a static “entity list” that defines which containers/components/entities they hold. As we complete the Containers API, we will also add support for dynamic containers which may contain different entities for different learners or at different times.

Parameters:

container_code (MultiCollationCharField) – Container code

Relationship fields:

Parameters:

Reverse relationships:

Parameters:
  • versions (Reverse ForeignKey from ContainerVersion) – All versions of this container (related name of container)

  • unit (Reverse OneToOneField from Unit) – The unit of this container (related name of container)

  • subsection (Reverse OneToOneField from Subsection) – The subsection of this container (related name of container)

  • section (Reverse OneToOneField from Section) – The section of this container (related name of container)

  • testcontainer (Reverse OneToOneField from TestContainer) – The test container of this container (related name of container)

  • containercontainer (Reverse OneToOneField from ContainerContainer) – The container container of this container (related name of base_container)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

static all_subclasses() list[type[Container]]#

Get a list of all installed container types

final classmethod get_container_type() ContainerType#

Get the ContainerType for this type of container, auto-creating it if need be.

property pk#

Mark the .pk attribute as deprecated

static register_subclass(container_subclass: type[Container])#

Register a Container subclass

final static reset_cache() None#

Helper for test cases that truncate the database between tests. Call this to delete the cache used in get_container_type(), which will be invalid after the ContainerType table is truncated.

static subclass_for_type_code(type_code: str) type[Container]#

Get the subclass for the specified container type_code.

classmethod validate_entity(entity: PublishableEntity) None#

Check if the given entity is allowed as a child of this Container type

Subclasses should raise ValidationError if “entity” is invalid.
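As a rough illustration of how the registration, lookup, and validation hooks above might fit together, here is a pure-Python sketch. SketchContainer, SketchUnit, the type_code class attribute, and the string-based entity kinds are hypothetical stand-ins: the real classes are Django models, and a real subclass would raise Django's ValidationError rather than ValueError.

```python
class SketchContainer:
    """Pure-Python stand-in for Container (hypothetical, not the real model)."""

    type_code = ""  # hypothetical identifier for each container type
    _registry: dict = {}  # maps type_code -> registered subclass

    @staticmethod
    def register_subclass(container_subclass):
        """Remember a subclass so it can be looked up by its type_code."""
        SketchContainer._registry[container_subclass.type_code] = container_subclass

    @staticmethod
    def subclass_for_type_code(type_code):
        """Return the registered subclass for the given type_code."""
        return SketchContainer._registry[type_code]

    @classmethod
    def validate_entity(cls, entity_kind):
        """Base behaviour in this sketch: accept any child."""


class SketchUnit(SketchContainer):
    type_code = "unit"

    @classmethod
    def validate_entity(cls, entity_kind):
        # Assumption for illustration only: a unit may hold only components.
        if entity_kind != "component":
            raise ValueError(f"{entity_kind!r} is not allowed in a {cls.type_code}")


SketchContainer.register_subclass(SketchUnit)
```

In this sketch, SketchContainer.subclass_for_type_code("unit") returns SketchUnit, and SketchUnit.validate_entity("unit") raises because only components are accepted.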

class openedx_content.models_api.ContainerVersion(*args, **kwargs)#

Bases: PublishableEntityVersionMixin

A version of a Container.

By convention, we would only want to create new versions when the Container itself changes, and not when the Container’s child elements change. For example:

  • Something was added to the Container.

  • We re-ordered the rows in the container.

  • Something was removed from the container.

  • The Container’s metadata changed, e.g. the title.

  • We pin to different versions of the Container’s children.

The last looks a bit odd, but it’s because how we’ve defined the Unit has changed if we decide to explicitly pin a set of versions for the children, and then later change our minds and move to a different set. It also just makes things easier to reason about if we say that entity_list never changes for a given ContainerVersion.
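One way to read the convention above is that a new ContainerVersion is needed exactly when the container's own definition changes. The sketch below models that definition as a title plus an ordered list of (entity key, pinned version) pairs. ContainerState and needs_new_version are hypothetical names used only for illustration, not part of the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContainerState:
    """Hypothetical snapshot of what a ContainerVersion defines for itself."""

    title: str
    # Ordered (entity_key, pinned_version_num) pairs; None means "unpinned".
    children: tuple[tuple[str, Optional[int]], ...]


def needs_new_version(current: ContainerState, proposed: ContainerState) -> bool:
    """A new version is needed only if the container's own definition differs."""
    return current != proposed


v1 = ContainerState(title="Unit 1", children=(("c1", None), ("c2", None)))

# Editing child c1 does not alter the Unit's own definition (c1 is unpinned).
after_child_edit = ContainerState(title="Unit 1", children=(("c1", None), ("c2", None)))

# Reordering rows or pinning a child does alter the definition.
reordered = ContainerState(title="Unit 1", children=(("c2", None), ("c1", None)))
pinned = ContainerState(title="Unit 1", children=(("c1", 3), ("c2", None)))
```

Here needs_new_version(v1, after_child_edit) is False, while the reordered and pinned states both require a new version.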

Relationship fields:

Parameters:

Reverse relationships:

Parameters:
  • unitversion (Reverse OneToOneField from UnitVersion) – The unit version of this container version (related name of container_version)

  • subsectionversion (Reverse OneToOneField from SubsectionVersion) – The subsection version of this container version (related name of container_version)

  • sectionversion (Reverse OneToOneField from SectionVersion) – The section version of this container version (related name of container_version)

  • testcontainerversion (Reverse OneToOneField from TestContainerVersion) – The test container version of this container version (related name of container_version)

  • containercontainerversion (Reverse OneToOneField from ContainerContainerVersion) – The container container version of this container version (related name of container_version)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

clean()#

Validate this model before saving. Not called normally, but will be called if anything is edited via a ModelForm like the Django admin.

class openedx_content.models_api.Draft(*args, **kwargs)#

Bases: Model

Find the active draft version of an entity (usually most recently created).

This model mostly only exists to allow us to join against a bunch of PublishableEntity objects at once and get all their latest drafts. You might use this together with Published in order to see which Drafts haven’t been published yet.

A Draft entry should be created whenever a new PublishableEntityVersion is created. This means there are three possible states:

  1. No Draft entry for a PublishableEntity: This means a PublishableEntity was created, but no PublishableEntityVersion was ever made for it, so there was never a Draft version.

  2. A Draft entry exists and points to a PublishableEntityVersion: This is the most common state.

  3. A Draft entry exists and points to a null version: This means a version used to be the draft, but it’s been functionally “deleted”. The version still exists in our history, but we’re done using it.
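The three states above can be sketched in plain Python. DraftRow and describe_draft_state are hypothetical stand-ins that model a Draft as an entity id plus an optional version number; they are not the real Django model or an API function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftRow:
    """Hypothetical stand-in for a Draft row."""

    entity_id: int
    version_num: Optional[int]  # None models a Draft pointing to a null version


def describe_draft_state(entity_id: int, drafts: dict[int, DraftRow]) -> str:
    """Classify an entity into one of the three states described above."""
    draft = drafts.get(entity_id)
    if draft is None:
        return "no draft"  # state 1: no version was ever created
    if draft.version_num is None:
        return "soft-deleted"  # state 3: Draft exists but points to no version
    return "active draft"  # state 2: the most common case


drafts = {
    2: DraftRow(entity_id=2, version_num=3),
    3: DraftRow(entity_id=3, version_num=None),
}
```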

It would have saved a little space to add this data to the Published model (and possibly call the combined model something else). Split Modulestore did this with its active_versions table. I keep it separate here to get a better separation of lifecycle events: i.e. this table only changes when drafts are updated, not when publishing happens. The Published model only changes when something is published.

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

class DraftQuerySet(model=None, query=None, using=None, hints=None)#

Bases: QuerySet

Custom QuerySet/Manager so we can chain common queries.

with_unpublished_changes()#

Drafts with versions that are different from what is Published.

This will not return Drafts that have unpublished changes in their dependencies. Example: A Unit is published with a Component as one of its children. Then someone modifies the draft of the Component. If both the Unit and the Component Drafts were part of the queryset, this method would return only the changed Component, and not the Unit. (We can add this as an optional flag later if we want.)
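The Unit and Component example can be mimicked with two dictionaries that map entity keys to version numbers. This is only a sketch of the comparison under that assumption. The real method is a QuerySet filter, and entities_with_unpublished_changes is a hypothetical name.

```python
from typing import Optional


def entities_with_unpublished_changes(
    draft_versions: dict[str, Optional[int]],
    published_versions: dict[str, Optional[int]],
) -> list[str]:
    """Return keys whose draft version differs from the published version.

    Dependencies are deliberately ignored, matching the behaviour described above.
    """
    return [
        key
        for key, draft_version in draft_versions.items()
        if published_versions.get(key) != draft_version
    ]


# The Unit and its child Component were both published at version 1;
# afterwards only the Component's draft was edited.
draft_versions = {"unit": 1, "component": 2}
published_versions = {"unit": 1, "component": 1}
changed = entities_with_unpublished_changes(draft_versions, published_versions)
```

Only the Component appears in changed, even though the Unit contains it.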

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.DraftChangeLog(*args, **kwargs)#

Bases: Model

There is one row in this table for every time Drafts are created/modified.

There are some operations that affect many Drafts at once, such as discarding changes (i.e. reset to the published versions) or doing an import. These would be represented by one DraftChangeLog with many DraftChangeLogRecords in it–one DraftChangeLogRecord for every PublishableEntity that was modified.

Even if we’re only directly changing the draft version of one PublishableEntity, we will get multiple DraftChangeLogRecords if changing that entity causes side-effects. See the docstrings for DraftChangeLogRecord and DraftSideEffect for more details.

Parameters:

Relationship fields:

Parameters:

Reverse relationships:

Parameters:

records (Reverse ForeignKey from DraftChangeLogRecord) – All records of this Draft Change Log (related name of draft_change_log)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.DraftChangeLogRecord(*args, **kwargs)#

Bases: Model

A single change in the PublishableEntity that Draft points to.

Within a single DraftChangeLog, there can be only one DraftChangeLogRecord per PublishableEntity. If a PublishableEntity goes from v1 -> v2 and then v2 -> v3 within the same DraftChangeLog, the expectation is that these will be collapsed into one DraftChangeLogRecord that goes from v1 -> v3. A single PublishableEntity may have many DraftChangeLogRecords that describe its full draft edit history, but each DraftChangeLogRecord will be a part of a different DraftChangeLog.
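A minimal sketch of the collapsing rule, assuming the records of one DraftChangeLog are kept in a dictionary keyed by entity. record_change and the tuple representation are hypothetical and only illustrate the expectation stated above.

```python
from typing import Optional

# (old_version_num, new_version_num); None stands for "no version".
Change = tuple[Optional[int], Optional[int]]


def record_change(
    records: dict[str, Change],
    entity_key: str,
    old: Optional[int],
    new: Optional[int],
) -> None:
    """Add a change to one change log, collapsing repeats for the same entity."""
    if entity_key in records:
        # Keep the original starting point and only move the end point.
        first_old, _ = records[entity_key]
        records[entity_key] = (first_old, new)
    else:
        records[entity_key] = (old, new)


records: dict[str, Change] = {}
record_change(records, "c1", 1, 2)
record_change(records, "c1", 2, 3)
```

After both calls, records holds a single entry for "c1" that goes from version 1 to version 3.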

New PublishableEntityVersions are created with a monotonically increasing version_num for their PublishableEntity. However, knowing that is not enough to accurately reconstruct how the Draft changes over time because the Draft does not always point to the most recently created PublishableEntityVersion. We also have the concept of side-effects, where we consider a PublishableEntity to have changed in some way, even if no new version is explicitly created.

The following scenarios may occur:

Scenario 1: old_version is None, new_version.version_num = 1

This is the common case when we’re creating the first version for editing.

Scenario 2: old_version.version_num + 1 == new_version.version_num

This is the common case when we’ve made an edit to something, which creates the next version of an entity, which we then point the Draft at.

Scenario 3: old_version.version_num >= 1, new_version is None

This is a soft-deletion. We never actually delete a row from the PublishableEntity model, but set its current Draft version to be None instead.

Scenario 4: old_version.version_num > new_version.version_num

This can happen if we “discard changes”, meaning that we call reset_drafts_to_published(). The versions we created before resetting don’t get deleted, but the Draft model’s pointer to the current version has been reset to match the Published model.

Scenario 5: old_version.version_num + 1 < new_version.version_num

Sometimes we’ll have a gap between the two version numbers that is > 1. This can happen if we make edits (new versions) after we called reset_drafts_to_published. PublishableEntityVersions are created with a monotonically incrementing version_num which will continue to go up with the next edit, regardless of whether Draft is pointing to the most recently created version or not. In terms of (old_version, new_version) changes, it could look like this:

  • (None, v1): Initial creation

  • # Publish happens here, so v1 of this PublishableEntity is published.

  • (v1, v2): Normal edit in draft

  • (v2, v3): Normal edit in draft

  • (v3, v1): Reset to published happened here.

  • (v1, v4): Normal edit in draft

This could also technically happen if we change the same entity more than once in the same bulk_draft_changes_for() context, thereby putting them into the same DraftChangeLog, which forces us to squash the changes together into one DraftChangeLogRecord.

Scenario 6: old_version is None, new_version.version_num > 1

This edge case can happen if we soft-deleted a published entity, and then called reset_drafts_to_published before we published that soft-deletion. It would effectively undo our soft-delete because the published version was not yet marked as deleted.

Scenario 7: old_version == new_version

This means that the data associated with the Draft version of an entity has changed purely as a side-effect of some other entity changing.

The main example we have of this are containers. Imagine that we have a Unit that is at v1, and has unpinned references to various Components that are its children. The Unit’s version does not get incremented when the Components are edited, because the Unit container is defined to always get the most recent version of those Components. We would only make a new version of the Unit if we changed the metadata of the Unit itself (e.g. the title), or if we added, removed, or reordered the children.

Yet updating a Component intuitively changes what we think of as the content of the Unit. Users who are working on Units also expect that a change to a Component will be reflected when looking at a Unit’s “last updated” info. The old_version == new_version convention lets us represent that in a useful way because that Unit is a part of the change set represented by a DraftChangeLog, even if its own versioned data hasn’t changed.
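The seven scenarios can be summarized as a small classifier over the two version numbers. This is an illustrative sketch only: classify_change is a hypothetical helper, version numbers are assumed to start at 1, and None stands for a missing version.

```python
from typing import Optional


def classify_change(old: Optional[int], new: Optional[int]) -> Optional[int]:
    """Return the scenario number (1-7) for an (old, new) version_num pair.

    Returns None for a pair that the scenarios above do not cover.
    """
    if old is None and new is None:
        return None  # not one of the listed scenarios
    if old is None:
        # First version ever, or a soft-delete undone by a reset.
        return 1 if new == 1 else 6
    if new is None:
        return 3  # soft-deletion
    if old == new:
        return 7  # changed purely as a side-effect
    if old > new:
        return 4  # changes were discarded
    if old + 1 == new:
        return 2  # ordinary edit
    return 5  # gap larger than 1


# The edit history listed under Scenario 5: (None, v1), (v1, v2), (v2, v3), (v3, v1), (v1, v4).
history = [(None, 1), (1, 2), (2, 3), (3, 1), (1, 4)]
scenarios = [classify_change(old, new) for old, new in history]
```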

Parameters:
  • id (BigAutoField) – Primary key: ID

  • dependencies_hash_digest (CharField) – Dependencies hash digest

Relationship fields:

Parameters:

Reverse relationships:

Parameters:
  • draft (Reverse ForeignKey from Draft) – All drafts of this Draft Change Log Record (related name of draft_log_record)

  • causes (Reverse ForeignKey from DraftSideEffect) – All causes of this Draft Change Log Record (related name of cause)

  • affected_by (Reverse ForeignKey from DraftSideEffect) – All affected by of this Draft Change Log Record (related name of effect)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.DraftSideEffect(*args, **kwargs)#

Bases: Model

Model to track when a change in one Draft affects other Drafts.

Our first use case for this is that changes involving child components are thought to affect parent Units, even if the parent’s version doesn’t change.

Side-effects are recorded in a collapsed form that only captures one level. So if Components C1 and C2 are both changed and they are part of Unit U1, which is in turn a part of Subsection SS1, then the DraftSideEffect entries are:

(C1, U1)
(C2, U1)
(U1, SS1)

We do not keep entries for (C1, SS1) or (C2, SS1). This is to make the model simpler, so we don’t have to differentiate between direct side-effects and transitive side-effects in the model.
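The one-level recording can be sketched as a walk up a parent map that emits only direct (cause, effect) pairs. direct_side_effects and the dictionary of parents are hypothetical and exist only to reproduce the C1, C2, U1, SS1 example above.

```python
from collections import deque


def direct_side_effects(
    changed: set[str], parents: dict[str, list[str]]
) -> list[tuple[str, str]]:
    """Return (cause, effect) pairs, one containment level at a time."""
    effects: list[tuple[str, str]] = []
    seen: set[tuple[str, str]] = set()
    queue = deque(sorted(changed))
    while queue:
        cause = queue.popleft()
        for parent in parents.get(cause, []):
            pair = (cause, parent)
            if pair not in seen:
                seen.add(pair)
                effects.append(pair)
                # The affected parent may in turn affect its own parents.
                queue.append(parent)
    return effects


parents = {"C1": ["U1"], "C2": ["U1"], "U1": ["SS1"]}
effects = direct_side_effects({"C1", "C2"}, parents)
```

The result contains (C1, U1), (C2, U1), and (U1, SS1), with no transitive (C1, SS1) or (C2, SS1) entries.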

We will record side-effects on a parent container whenever a child changes, even if the parent container is also changing in the same DraftChangeLog. The child change is still affecting the parent container, whether or not the container happens to be changing for other reasons as well. Whether a parent-child relationship exists or not depends on the draft state of the container at the end of a bulk_draft_changes_for context. To give concrete examples:

Setup: A Unit version U1.v1 has defined C1 to be a child. The current draft version of C1 is C1.v1.

Scenario 1: In a bulk_draft_changes_for context, we edit C1 so that the draft version of C1 is now C1.v2. Result:

  • a DraftChangeLogRecord is created for C1.v1 -> C1.v2

  • a DraftChangeLogRecord is created for U1.v1 -> U1.v1

  • a DraftSideEffect is created with cause (C1.v1 -> C1.v2) and effect (U1.v1 -> U1.v1). The Unit draft version has not been incremented because the metadata a Unit defines for itself hasn’t been altered, but the Unit has changed in some way because of the side effect of its child being edited.

Scenario 2: In a bulk_draft_changes_for context, we edit C1 so that the draft version of C1 is now C1.v2. In the same context, we edit U1’s metadata so that the draft version of U1 is now U1.v2. U1.v2 still lists C1 as a child entity. Result:

  • a DraftChangeLogRecord is created for C1.v1 -> C1.v2

  • a DraftChangeLogRecord is created for U1.v1 -> U1.v2

  • a DraftSideEffect is created with cause (C1.v1 -> C1.v2) and effect (U1.v1 -> U1.v2)

Scenario 3: In a bulk_draft_changes_for context, we edit C1 so that the draft version of C1 is now C1.v2. In the same context, we edit U1’s list of children so that C1 is no longer a child of U1.v2. Result:

  • a DraftChangeLogRecord is created for C1.v1 -> C1.v2

  • a DraftChangeLogRecord is created for U1.v1 -> U1.v2

  • no DraftSideEffect is created, since changing C1 does not have an impact on the current draft of U1 (U1.v2). A DraftChangeLog is considered a single atomic operation, so there was never a point at which C1.v1 -> C1.v2 affected the draft state of U1.
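The three scenarios differ only in what the container's draft lists as children at the end of the context. A small sketch of that rule follows, with side_effects_at_end_of_context as a hypothetical helper and child lists modeled as plain Python lists.

```python
def side_effects_at_end_of_context(
    changed_children: set[str],
    final_children_of: dict[str, list[str]],
) -> list[tuple[str, str]]:
    """Pair each changed child with every container whose final draft lists it."""
    return [
        (child, container)
        for container, children in final_children_of.items()
        for child in children
        if child in changed_children
    ]


# Scenarios 1 and 2: U1's draft still lists C1 when the context ends.
still_child = side_effects_at_end_of_context({"C1"}, {"U1": ["C1"]})

# Scenario 3: C1 was removed from U1 in the same context.
removed_child = side_effects_at_end_of_context({"C1"}, {"U1": []})
```

In this sketch still_child contains the (C1, U1) pair, and removed_child is empty because the final draft of U1 no longer references C1.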

Parameters:

id (BigAutoField) – Primary key: ID

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.EntityList(*args, **kwargs)#

Bases: Model

EntityLists are a common structure to hold parent-child relations.

EntityLists are not PublishableEntities in and of themselves. That’s because sometimes we’ll want the same kind of data structure for things that we dynamically generate for individual students (e.g. Variants). EntityLists are anonymous in a sense–they’re pointed to by ContainerVersions and other models, rather than being looked up by their own identifiers.

Parameters:

id (BigAutoField) – Primary key: ID

Reverse relationships:

Parameters:
  • entitylistrow (Reverse ForeignKey from EntityListRow) – All entity list rows of this entity list (related name of entity_list)

  • container_versions (Reverse ForeignKey from ContainerVersion) – All container versions of this entity list (related name of entity_list)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

property rows#

Convenience method to iterate rows.

I’d normally make this the reverse lookup name for the EntityListRow -> EntityList foreign key relation, but we already have references to entitylistrow_set in various places, and I thought this would be better than breaking compatibility.

class openedx_content.models_api.EntityListRow(*args, **kwargs)#

Bases: Model

Each EntityListRow points to a PublishableEntity, optionally at a specific version.

There is a row in this table for each member of an EntityList. The order_num field is used to determine the order of the members in the list.
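As an illustration of how order_num and optional pinning might be consumed, the sketch below sorts rows and resolves each one to a version number. Row, pinned_version_num, and resolve_rows are hypothetical names, and falling back to the current draft version for unpinned rows is an assumption based on the container description elsewhere in this module.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    """Hypothetical stand-in for an EntityListRow."""

    order_num: int
    entity_key: str
    pinned_version_num: Optional[int] = None  # None means "unpinned"


def resolve_rows(
    rows: list[Row], draft_versions: dict[str, int]
) -> list[tuple[str, int]]:
    """Return (entity_key, version_num) pairs in list order."""
    ordered = sorted(rows, key=lambda row: row.order_num)
    return [
        (
            row.entity_key,
            row.pinned_version_num
            if row.pinned_version_num is not None
            else draft_versions[row.entity_key],  # assumption: unpinned -> draft
        )
        for row in ordered
    ]


rows = [Row(1, "c2"), Row(0, "c1", pinned_version_num=2)]
resolved = resolve_rows(rows, {"c1": 5, "c2": 7})
```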

Parameters:

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.LearningPackage(*args, **kwargs)#

Bases: Model

Top level container for a grouping of authored content.

Each PublishableEntity belongs to exactly one LearningPackage.

Parameters:
  • id (IDField) – Primary key: Id

  • uuid (UUIDField) – UUID

  • package_ref (MultiCollationCharField) – Package ref

  • title (MultiCollationCharField) – Title

  • description (MultiCollationTextField) – Description

  • created (DateTimeField) – Created

  • updated (DateTimeField) – Updated

Reverse relationships:

Parameters:
  • publishable_entities (Reverse ForeignKey from PublishableEntity) – All publishable entities of this Learning Package (related name of learning_package)

  • draftchangelog (Reverse ForeignKey from DraftChangeLog) – All Draft Change Logs of this Learning Package (related name of learning_package)

  • publishlog (Reverse ForeignKey from PublishLog) – All Publish Logs of this Learning Package (related name of learning_package)

  • collection (Reverse ForeignKey from Collection) – All Collections of this Learning Package (related name of learning_package)

  • media (Reverse ForeignKey from Media) – All Media of this Learning Package (related name of learning_package)

  • component (Reverse ForeignKey from Component) – All Components of this Learning Package (related name of learning_package)

  • container (Reverse ForeignKey from Container) – All containers of this Learning Package (related name of learning_package)

exception DoesNotExist#

Bases: ObjectDoesNotExist

class IDField(*args, **kwargs)#

Bases: TypedAutoField

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.Media(*args, **kwargs)#

Bases: Model

This is the most primitive piece of content.

This model serves to lookup, de-duplicate, and store text and files. A piece of Media is identified purely by its data, the media type, and the LearningPackage it is associated with. It has no version or file name metadata associated with it. It exists to be a dumb blob of data that higher level models like ComponentVersions can assemble together.

# In-model Text vs. File

That being said, the Media model does have some complexity to accommodate different access patterns that we have in our app. In particular, it can store data in two ways: the text field and a file (has_file=True). A Media object must use at least one of these methods, but can use both if it’s appropriate.

Use the text field when:

  • the content is a relatively small (< 50K, usually much less) piece of text

  • you want to be able to query across many rows at once

  • low, predictable latency is important

Use file storage when:

  • the content is large, or not text-based

  • you want to be able to serve the file content directly to the browser

The high level tradeoff is that text will give you faster access, and file storage will give you a much more affordable and scalable backend. The backend used for files will also eventually allow direct browser download access, whereas the text field will not. But again, you can use both at the same time if needed.
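The guidance above could be turned into a small decision helper. This is only a sketch: choose_storage is a hypothetical function, and reading "< 50K" as 50,000 bytes is an assumption.

```python
TEXT_SIZE_LIMIT = 50_000  # assumption: "< 50K" read as bytes


def choose_storage(
    size: int, is_text: bool, serve_to_browser: bool = False
) -> tuple[bool, bool]:
    """Return (use_text_field, has_file) for a piece of content.

    At least one of the two is always True, matching the rule that a Media
    object must use at least one storage method.
    """
    use_text_field = is_text and size < TEXT_SIZE_LIMIT
    has_file = serve_to_browser or not use_text_field
    return use_text_field, has_file
```

For example, a small text document that must also be downloadable would use both storage methods, while a large binary file would use file storage only.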

# Association with a LearningPackage

Media is associated with a specific LearningPackage. Doing so allows us to more easily query for how much storage space a specific LearningPackage (likely a library) is using, and to clean up unused data.

When we get to borrowing Media across LearningPackages, it’s likely that we will want to copy them. That way, even if the originating LearningPackage is deleted, it won’t break other LearningPackages that are making use of it.

# Media Types, and file duplication

Media is almost 1:1 with the files that it pushes to a storage backend, but not quite. The file locations are generated purely as a product of the LearningPackage UUID and the Media’s hash_digest, but Media also takes into account the media_type.

For example, say we had a Media with the following data:

["hello", "world"]

That is legal syntax for both JSON and YAML. If you want to attach some YAML-specific metadata in a new model, you could make it 1:1 with the Media that matched the “application/yaml” media type. The YAML and JSON versions of this data would be two separate Media rows that would share the same hash_digest value. If they both stored a file, they would be pointing to the same file location. If they only used the text field, then that value would be duplicated across the two separate Media rows.
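The JSON/YAML collision can be sketched with the standard library. Several details here are assumptions, since the text does not specify them: the hash algorithm (blake2b is a placeholder), the exact file location layout, and the dictionary used in place of Media rows.

```python
import hashlib
import json
import uuid


def hash_digest(data: bytes) -> str:
    # Assumption: the hash algorithm is a placeholder chosen for this sketch.
    return hashlib.blake2b(data, digest_size=20).hexdigest()


def file_location(learning_package_uuid: uuid.UUID, digest: str) -> str:
    # Hypothetical layout derived only from the package UUID and the digest.
    return f"content/{learning_package_uuid}/{digest}"


package_uuid = uuid.uuid4()
data = b'["hello", "world"]'
parsed = json.loads(data)  # the same bytes are valid JSON
digest = hash_digest(data)

# Two Media rows in this sketch: identical bytes, different media types.
media_rows = {
    ("application/json", digest): file_location(package_uuid, digest),
    ("application/yaml", digest): file_location(package_uuid, digest),
}
```

The two rows stay distinct because the media type is part of their identity, but both resolve to the same file location because the location ignores the media type.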

The alternative would have been to associate media types at the level where this data was being added to a ComponentVersion, but that would have added more complexity. Right now, you could make an ImageMedia 1:1 model that analyzed images and created metadata entries for them (dimensions, GPS) without having to understand how ComponentVersions work.

This is definitely an edge case, and it’s likely the only time collisions like this will happen in practice is with blank files. It also means that using this table to measure disk usage may be slightly inaccurate when used in a LearningPackage with collisions–though we expect to use numbers like that mostly to get a broad sense of usage and look for major outliers, rather than for byte-level accuracy (it wouldn’t account for the non-trivial indexing storage costs either).

# Immutability

From the outside, Media should appear immutable. Since the Media is looked up by a hash of its data, a change in the data means that we should look up the hash value of that new data and create a new Media if we don’t find a match.

That being said, the Media model has different ways of storing that data, and that is mutable. We could decide that a certain type of Media should be optimized to store its text in the table. Or that a media type that we had previously only stored as text now also needs to be stored in the file storage backend so that it can be made available to be downloaded. These operations would be done as data migrations.

# Extensibility

Third-party apps are encouraged to create models that have a OneToOneField relationship with Media. For instance, an ImageMedia model might join 1:1 with all Media that has image/* media types, and provide additional metadata for that data.

Parameters:
  • id (IDField) – Primary key: Id

  • size (PositiveBigIntegerField) – Size

  • hash_digest (CharField) – Hash digest

  • has_file (BooleanField) – Has file

  • text (MultiCollationTextField) – Text

  • created (DateTimeField) – Created

Relationship fields:

Parameters:

Reverse relationships:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

class IDField(*args, **kwargs)#

Bases: TypedBigAutoField

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

clean()#

Make sure we’re actually storing something.

If this Media has neither a file nor text data associated with it, it’s in a broken/useless state and shouldn’t be saved.
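A pure-Python sketch of this check follows. MediaSketch is a hypothetical stand-in, treating a missing text value as None is an assumption, and ValueError takes the place of the validation error a Django model would normally raise from clean().

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaSketch:
    """Hypothetical stand-in for Media with only the two storage fields."""

    text: Optional[str] = None
    has_file: bool = False

    def clean(self) -> None:
        """Reject a row that stores neither text nor a file."""
        if self.text is None and not self.has_file:
            raise ValueError("Media must have text, a file, or both.")


MediaSketch(text="<problem/>").clean()  # passes: text only
MediaSketch(has_file=True).clean()  # passes: file only
```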

file_url() str#

This will sometimes be a time-limited signed URL.

property mime_type: str#

The IANA media type (a.k.a. MIME type) of the Media, in string form.

MIME types reference:

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

os_path()#

The full OS path for the underlying file for this Media.

This will not be supported by all Storage class types.

This will return None if there is no backing file (has_file=False).

property path#

Logical path at which this content is stored (or would be stored).

This path is relative to the OPENEDX_LEARNING['MEDIA'] configured storage root. This file may not exist because has_file=False, or because we haven’t written the file yet (this is the method we call when trying to figure out where the file should go).

For historical reasons (and backwards compatibility), the prefix for this path is “content/” and not “media/”.

read_file() File#

Get a File object that has been opened for reading.

We intentionally don’t expose an open() call where callers can open this file in write mode. Writing a Media file should happen at most once, and the logic is not obvious (see write_file).

At the end of the day, the caller can close the returned File and reopen it in whatever mode they want, but we’re trying to gently discourage that kind of usage.

write_file(file: File) None#

Write file contents to the file storage backend.

This function does nothing if the file already exists. Note that Media is supposed to be immutable, so this should normally only be called once for a given Media row.
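The write-once behavior can be illustrated with the standard library alone. This is a minimal sketch, assuming a local filesystem instead of a Django storage backend; write_once and the content/abc123 path are hypothetical names used only for illustration.

```python
import tempfile
from pathlib import Path


def write_once(path: Path, data: bytes) -> bool:
    """Write data to path unless it already exists; return True if written."""
    if path.exists():
        # Media is immutable, so an existing file is left untouched.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return True


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "content" / "abc123"  # hypothetical logical path
    first = write_once(target, b"original")     # creates the file
    second = write_once(target, b"ignored")     # no-op: file already exists
    stored = target.read_bytes()
```

The second call changes nothing on disk, so repeating a write for the same Media row is harmless.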

class openedx_content.models_api.MediaType(*args, **kwargs)#

Bases: Model

Stores Media types for use by Media models.

This is the same as MIME types (the IANA renamed MIME Types to Media Types). We don’t pre-populate this table, so APIs that add Media must ensure that the desired Media Type exists.

Media types are written as {type}/{sub_type}+{suffix}, where suffixes are seldom used. Examples:

  • application/json

  • text/css

  • image/svg+xml

  • application/vnd.openedx.xblock.v1.problem+xml
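The {type}/{sub_type}+{suffix} layout maps onto the three fields of this model. A minimal parsing sketch follows; split_media_type and MediaTypeParts are hypothetical helpers, and representing a missing suffix as an empty string is an assumption rather than something this page confirms.

```python
from typing import NamedTuple


class MediaTypeParts(NamedTuple):
    type: str
    sub_type: str
    suffix: str  # assumption: empty string when there is no "+suffix"


def split_media_type(media_type: str) -> MediaTypeParts:
    """Split "{type}/{sub_type}+{suffix}" into its three parts."""
    type_, _, rest = media_type.partition("/")
    if not type_ or not rest:
        raise ValueError(f"Not a media type: {media_type!r}")
    # Split on the last "+" so the suffix is whatever follows it.
    sub_type, sep, suffix = rest.rpartition("+")
    if not sep:
        # No "+" present, so the whole remainder is the sub_type.
        sub_type, suffix = rest, ""
    return MediaTypeParts(type_, sub_type, suffix)


parts = split_media_type("application/vnd.openedx.xblock.v1.problem+xml")
```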

We have this as a separate model (instead of a field on Media) because:

  1. We can save a lot on storage and indexing for Media if we’re just storing foreign key references there, rather than the entire content string to be indexed. This is especially relevant for our (long) custom types like “application/vnd.openedx.xblock.v1.problem+xml”.

  2. These values can occasionally change. For instance, “text/javascript” vs. “application/javascript”. Also, we will be using a fair number of “vnd.” style of custom content types, and we may want the flexibility of changing that without having to worry about migrating millions of rows of Media.

Parameters:
  • id (AutoField) – Primary key: Id

  • type (MultiCollationCharField) – Type

  • sub_type (MultiCollationCharField) – Sub type

  • suffix (MultiCollationCharField) – Suffix

Reverse relationships:

Parameters:

media (Reverse ForeignKey from Media) – All Media of this media type (related name of media_type)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.PublishLog(*args, **kwargs)#

Bases: Model

There is one row in this table for every time content is published.

Each PublishLog has 0 or more PublishLogRecords describing exactly which PublishableEntities were published and what the version changes are. A PublishLog is like a git commit in that sense, with individual PublishLogRecords representing the files changed.

Open question: Empty publishes are allowed at this time, and might be useful for “fake” publishes that are necessary to invoke other post-publish actions. It’s not clear at this point how useful this will actually be.

The absence of a version_num field in this model is intentional, because having one would potentially cause write contention/locking issues when there are many people working on different entities in a very large library. We already see some contention issues occurring in ModuleStore for courses, and we want to support Libraries that are far larger.

If you need a LearningPackage-wide indicator for version and the only thing you care about is “has something changed?”, you can make a foreign key to the most recent PublishLog, or use the most recent PublishLog’s primary key. This should be monotonically increasing, though there will be large gaps in values, e.g. (5, 190, 1291, etc.). Be warned that this value will not port across sites. If you need site-portability, the UUIDs for this model are a safer bet, though there’s a lot about import/export that we haven’t fully mapped out yet.

Parameters:

Relationship fields:

Parameters:

Reverse relationships:

Parameters:

records (Reverse ForeignKey from PublishLogRecord) – All records of this Publish Log (related name of publish_log)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.PublishLogRecord(*args, **kwargs)#

Bases: Model

A record for each publishable entity version changed, for each publish.

To revert a publish, we would make a new publish that swaps old_version and new_version field values.

If the old_version and new_version of a PublishLogRecord match, it means that the definition of the entity itself did not change (i.e. no new PublishableEntityVersion was created), but something else was published that had the side-effect of changing the published state of this entity. For instance, if a Unit has unpinned references to its child Components (which it almost always will), then publishing one of those Components will alter the published state of the Unit, even if the UnitVersion does not change.
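The revert-by-swapping idea can be sketched with plain data. RecordSketch is a hypothetical stand-in for a PublishLogRecord row. Using integers for versions, and None for "not published", is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecordSketch:
    """Hypothetical stand-in for a PublishLogRecord row."""

    entity: str
    old_version: Optional[int]  # assumption: None means not published before
    new_version: Optional[int]  # assumption: None means unpublished afterwards


def revert_records(records: list[RecordSketch]) -> list[RecordSketch]:
    """Build the records for a new publish that undoes the given one."""
    return [
        RecordSketch(r.entity, old_version=r.new_version, new_version=r.old_version)
        for r in records
    ]


# C1 moved from version 1 to 2; U1 was only touched as a side effect.
publish = [RecordSketch("C1", 1, 2), RecordSketch("U1", 3, 3)]
undo = revert_records(publish)
```

A record whose old_version and new_version match is unchanged by the swap, which is consistent with it describing a side effect and not a new version.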

Parameters:

Relationship fields:

Parameters:

Reverse relationships:

Parameters:
  • published (Reverse ForeignKey from Published) – All Published Entities of this Publish Log Record (related name of publish_log_record)

  • causes (Reverse ForeignKey from PublishSideEffect) – All causes of this Publish Log Record (related name of cause)

  • affected_by (Reverse ForeignKey from PublishSideEffect) – All affected by of this Publish Log Record (related name of effect)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.PublishSideEffect(*args, **kwargs)#

Bases: Model

Model to track when a change in one Published entity affects others.

Our first use case for this is that changes involving child components are thought to affect parent Units, even if the parent’s version doesn’t change.

Side-effects are recorded in a collapsed form that only captures one level. So if Components C1 and C2 are both published and they are part of Unit U1, which is in turn a part of Subsection SS1, then the PublishSideEffect entries are:

(C1, U1)
(C2, U1)
(U1, SS1)

We do not keep entries for (C1, SS1) or (C2, SS1). This is to make the model simpler, so we don’t have to differentiate between direct side-effects and transitive side-effects in the model.
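The collapsed, one-level form can be reproduced with a short breadth-first walk. This is a sketch under stated assumptions: collapsed_side_effects is a hypothetical helper, and the parents mapping stands in for whatever container relationships the real models store.

```python
def collapsed_side_effects(
    published: list[str], parents: dict[str, list[str]]
) -> list[tuple[str, str]]:
    """Return (cause, effect) pairs, recording only direct parents.

    published: entities that were directly published.
    parents: maps each entity to the containers that hold it.
    """
    pairs = []
    seen = set(published)
    queue = list(published)
    while queue:
        cause = queue.pop(0)
        for effect in parents.get(cause, []):
            pairs.append((cause, effect))
            if effect not in seen:
                # An affected parent can in turn affect its own parents.
                seen.add(effect)
                queue.append(effect)
    return pairs


parents = {"C1": ["U1"], "C2": ["U1"], "U1": ["SS1"]}
pairs = collapsed_side_effects(["C1", "C2"], parents)
```

Transitive effects such as (C1, SS1) are not stored, but they can be recovered by following the pairs from C1 to U1 and then from U1 to SS1.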

Parameters:

id (BigAutoField) – Primary key: ID

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.PublishableContentModelRegistry#

Bases: object

This class tracks content models built on PublishableEntity(Version).

classmethod register(content_model_cls: type[PublishableEntityMixin], content_version_model_cls: type[PublishableEntityVersionMixin])#

Register what content model maps to what content version model.

If you want to call this from another app, please use the register_publishable_models function in this app’s api module instead.

class openedx_content.models_api.PublishableEntity(*args, **kwargs)#

Bases: Model

This represents any publishable thing that has ever existed in a LearningPackage. It serves as a stable model that will not go away even if these things are later unpublished or deleted.

A PublishableEntity belongs to exactly one LearningPackage.

Examples of Publishable Entities#

Components (e.g. VideoBlock, ProblemBlock), Units, and Sections/Subsections would all be considered Publishable Entities. But anything that can be imported, exported, published, and reverted in a course or library could be modeled as a PublishableEntity, including things like Grading Policy or possibly Taxonomies (?).

How to use this model#

The publishing app understands that publishable entities exist, along with their drafts and published versions. It has some basic metadata, such as identifiers, who created it, and when it was created. It’s meant to encapsulate the draft and publishing related aspects of your content, but the publishing app doesn’t know anything about the actual content being referenced.

You have to provide actual meaning to PublishableEntity by creating your own models that will represent your particular content and associating them to PublishableEntity via a OneToOneField with primary_key=True. The easiest way to do this is to have your model inherit from PublishableEntityMixin.

Identifiers#

The UUID is globally unique and should be treated as immutable.

The key field is mutable, but changing it will affect all PublishableEntityVersions. Keys are locally unique within the LearningPackage.

If you are referencing this model from within the same process, use a foreign key to the id. If you are referencing this PublishableEntity from an external system/service, use the UUID. The key is the part that is most likely to be human-readable, and may be exported/copied, but try not to rely on it, since this value may change.

Note: When we actually implement the ability to change identifiers, we should make a history table and a modified attribute on this model.

Why are Identifiers in this Model?#

A PublishableEntity never stands alone–it’s always intended to be used with a 1:1 model like Component or Unit. So why have all the identifiers in this model instead of storing them in those other models? Two reasons:

  • Published things need to have the right identifiers so they can be used throughout the system, and the UUID is serving the role of ISBN in physical book publishing.

  • We want to be able to enforce the idea that “entity_ref” is locally unique across all PublishableEntities within a given LearningPackage. Component and Unit can’t do that without a shared model.

That being said, models that build on PublishableEntity are free to add their own identifiers if it’s useful to do so.

Why not Inherit from this Model?#

Django supports multi-table inheritance:

We don’t use that, primarily because we want to more clearly decouple publishing concerns from the rest of the logic around Components, Units, etc. If you made Component and ComponentVersion models that subclassed PublishableEntity and PublishableEntityVersion, and then accessed component.versions, you might expect ComponentVersions to come back and be surprised when you get EntityVersions instead.

In general, we want freedom to add new Publishing models, fields, and methods without having to worry about the downstream name collisions with other apps (many of which might live in other repositories). The helper mixins will provide a little syntactic sugar to make common access patterns more convenient, like file access.

Parameters:
  • id (IDField) – Primary key: Id

  • uuid (UUIDField) – UUID

  • entity_ref (MultiCollationCharField) – Entity ref

  • created (DateTimeField) – Created

  • can_stand_alone (BooleanField) – Can stand alone. Set to True when created independently, False when created as part of a container.

Relationship fields:

Parameters:
  • learning_package (ForeignKey to LearningPackage) – Learning package (related name: publishable_entities)

  • created_by (ForeignKey to User) – Created by (related name: publishableentity)

Reverse relationships:

Parameters:
  • versions (Reverse ForeignKey from PublishableEntityVersion) – All versions of this Publishable Entity (related name of entity)

  • affects (Reverse ManyToManyField from PublishableEntityVersion) – All affects of this Publishable Entity (related name of dependencies)

  • publishableentityversiondependency (Reverse ForeignKey from PublishableEntityVersionDependency) – All publishable entity version dependencys of this Publishable Entity (related name of referenced_entity)

  • draft (Reverse OneToOneField from Draft) – The draft of this Publishable Entity (related name of entity)

  • draftchangelogrecord (Reverse ForeignKey from DraftChangeLogRecord) – All Draft Change Log Records of this Publishable Entity (related name of entity)

  • publishlogrecord (Reverse ForeignKey from PublishLogRecord) – All Publish Log Records of this Publishable Entity (related name of entity)

  • published (Reverse OneToOneField from Published) – The Published Entity of this Publishable Entity (related name of entity)

  • collections (Reverse ManyToManyField from Collection) – All collections of this Publishable Entity (related name of entities)

  • collectionpublishableentity (Reverse ForeignKey from CollectionPublishableEntity) – All collection publishable entitys of this Publishable Entity (related name of entity)

  • component (Reverse OneToOneField from Component) – The Component of this Publishable Entity (related name of publishable_entity)

  • entitylistrow (Reverse ForeignKey from EntityListRow) – All entity list rows of this Publishable Entity (related name of entity)

  • container (Reverse OneToOneField from Container) – The container of this Publishable Entity (related name of publishable_entity)

  • testentity (Reverse OneToOneField from TestEntity) – The test entity of this Publishable Entity (related name of publishable_entity)

exception DoesNotExist#

Bases: ObjectDoesNotExist

class IDField(*args, **kwargs)#

Bases: TypedBigAutoField

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.PublishableEntityMixin(*args, **kwargs)#

Bases: Model

Convenience mixin to link your models against PublishableEntity.

Please see docstring for PublishableEntity for more details.

If you use this class, you MUST also use PublishableEntityVersionMixin and the publishing app’s api.register_publishable_models (see its docstring for details).

Relationship fields:

Parameters:

publishable_entity (OneToOneField to PublishableEntity) – Primary key: Publishable entity

class VersioningHelper(content_obj)#

Bases: object

Helper class to link content models to their versions.

The publishing app has PublishableEntity and PublishableEntityVersion. This is a helper class so that if you mix PublishableEntityMixin into a content model like Component, then you can do something like:

component.versioning.draft  # current draft ComponentVersion
component.versioning.published  # current published ComponentVersion

It links the relationships between content models and their versioned counterparts through the connection between PublishableEntity and PublishableEntityVersion. So component.versioning.draft ends up querying: Component -> PublishableEntity -> Draft -> PublishableEntityVersion -> ComponentVersion. But the people writing Component don’t need to understand how the publishing models work to do these common queries.

Caching Warning#

Note that because we’re just using the underlying model’s relations, calling this a second time will return the cached relation, and not cause a fetch of new data from the database. So for instance, if you do:

# Create a new Component + ComponentVersion
component, component_version = create_component_and_version(
    learning_package_id=learning_package.id,
    namespace="xblock.v1",
    type="problem",
    component_code="monty_hall",
    title="Monty Hall Problem",
    created=now,
    created_by=None,
)

# This will work, because it's never been published
assert component.versioning.published is None

# Publishing happens
publish_all_drafts(learning_package.id, published_at=now)

# This will FAIL because it's going to use the relation value
# cached on component instead of going to the database again.
# You need to re-fetch the component for this to work.
assert component.versioning.published == component_version

# You need to manually refetch it from the database to see the new
# publish status:
component = get_component(component.id)

# Now this will work:
assert component.versioning.published == component_version

TODO: This probably means we should use a custom Manager to select related fields.

__init__(content_obj)#
property draft#

Return the content version object that is the current draft.

So if you mix PublishableEntityMixin into Component, then component.versioning.draft will return you the ComponentVersion that is the current draft (not the underlying PublishableEntityVersion).

If this is causing many queries, it might be the case that you need to add select_related('publishable_entity__draft__version') to the queryset.

property has_unpublished_changes#

Do we have unpublished changes?

The simplest way to implement this would be to check self.published vs. self.draft, but that would cause unnecessary queries. This implementation should require no extra queries provided that the model was instantiated using a queryset that used a select related that has at least publishable_entity__draft and publishable_entity__published.

property last_publish_log#

Return the most recent PublishLog for this component.

Return None if the component is not published.

property latest#

Return the most recently created version for this content object.

This can be None if no versions have been created.

This is often the same as the draft version, but can differ if the content object was soft deleted or the draft was reverted.

property published#

Return the content version object that is currently published.

So if you mix PublishableEntityMixin into Component, then component.versioning.published will return you the ComponentVersion that is currently published (not the underlying PublishableEntityVersion).

If this is causing many queries, it might be the case that you need to add select_related('publishable_entity__published__version') to the queryset.

version_num(version_num)#

Return a specific numbered version model.

property versions#

Return a QuerySet of content version models for this content model.

Example: If you mix PublishableEntityMixin into a Component model, this would return you a QuerySet of ComponentVersion models.

class openedx_content.models_api.PublishableEntityVersion(*args, **kwargs)#

Bases: Model

A particular version of a PublishableEntity.

This model has its own uuid so that it can be referenced directly. The uuid should be treated as immutable.

PublishableEntityVersions are created once and never updated. So for instance, the title should never be modified.

Like PublishableEntity, the data in this model is only enough to cover the parts that are most important for the actual process of managing drafts and publishes. You will want to create your own models to represent the actual content data that’s associated with this PublishableEntityVersion, and connect them using a OneToOneField with primary_key=True. The easiest way to do this is to inherit from PublishableEntityVersionMixin. Be sure to treat these versioned models in your app as immutable as well.

Parameters:

Relationship fields:

Parameters:

Reverse relationships:

Parameters:
  • publishableentityversiondependency (Reverse ForeignKey from PublishableEntityVersionDependency) – All publishable entity version dependencys of this Publishable Entity Version (related name of referring_version)

  • draft (Reverse OneToOneField from Draft) – The draft of this Publishable Entity Version (related name of version)

  • draftchangelogrecord (Reverse ForeignKey from DraftChangeLogRecord) – All Draft Change Log Records of this Publishable Entity Version (related name of new_version)

  • publishlogrecord (Reverse ForeignKey from PublishLogRecord) – All Publish Log Records of this Publishable Entity Version (related name of new_version)

  • published (Reverse OneToOneField from Published) – The Published Entity of this Publishable Entity Version (related name of version)

  • componentversion (Reverse OneToOneField from ComponentVersion) – The Component Version of this Publishable Entity Version (related name of publishable_entity_version)

  • containerversion (Reverse OneToOneField from ContainerVersion) – The container version of this Publishable Entity Version (related name of publishable_entity_version)

  • testentityversion (Reverse OneToOneField from TestEntityVersion) – The test entity version of this Publishable Entity Version (related name of publishable_entity_version)

exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.PublishableEntityVersionDependency(*args, **kwargs)#

Bases: Model

Track the PublishableEntities that a PublishableEntityVersion depends on.

For example, a particular version of a Unit (U1.v1) might be defined to have unpinned references to Components C1 and C2. That means that any changes in C1 or C2 will affect U1.v1 via DraftSideEffects and PublishedSideEffects. We say that C1 and C2 are dependencies of U1.v1.

An important restriction is that a PublishableEntityVersion’s list of dependencies is defined when the version is created. It is not modified after that. No matter what happens to C1 or C2 (e.g. edit, deletion, un-deletion, reset-draft-version-to-published), they will always be dependencies of U1.v1.

If someone removes C2 from U1, then that requires creating a new version of U1 (so U1.v2).

This restriction is important because our ability to calculate and cache the state of “this version of this publishable entity and all its dependencies (children)” relies on this being true.
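The "dependencies are fixed at creation" rule can be modeled with immutable values. VersionSketch and new_version_without are hypothetical names, and incrementing version_num by one is an assumption based on the U1.v1 to U1.v2 example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionSketch:
    """Hypothetical stand-in for a PublishableEntityVersion."""

    entity: str
    version_num: int
    dependencies: frozenset  # fixed when the version is created


def new_version_without(version: VersionSketch, removed: str) -> VersionSketch:
    """Removing a dependency yields a new version; the old one is untouched."""
    return VersionSketch(
        entity=version.entity,
        version_num=version.version_num + 1,
        dependencies=version.dependencies - {removed},
    )


u1_v1 = VersionSketch("U1", 1, frozenset({"C1", "C2"}))
u1_v2 = new_version_without(u1_v1, "C2")  # U1.v1 still depends on C1 and C2
```

Because U1.v1 never changes, anything computed from "U1.v1 plus its dependencies" can be cached against that version.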

Parameters:

id (BigAutoField) – Primary key: ID

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.PublishableEntityVersionMixin(*args, **kwargs)#

Bases: Model

Convenience mixin to link your models against PublishableEntityVersion.

Please see docstring for PublishableEntityVersion for more details.

If you use this class, you MUST also use PublishableEntityMixin and the publishing app’s api.register_publishable_models (see its docstring for details).

Relationship fields:

Parameters:

publishable_entity_version (OneToOneField to PublishableEntityVersion) – Primary key: Publishable entity version

class openedx_content.models_api.Published(*args, **kwargs)#

Bases: Model

Find the currently published version of an entity.

Notes:

  • There is only ever one published PublishableEntityVersion per PublishableEntity at any given time.

  • It may be possible for a PublishableEntity to exist only as a Draft (and thus not show up in this table).

  • If a row exists for a PublishableEntity, but the version field is None, it means that the entity was published at some point, but is no longer published now–i.e. it’s functionally “deleted”, even though all the version history is preserved behind the scenes.
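The three cases in the notes above can be told apart with two checks. In this sketch the Published table is modeled as a plain dictionary from entity to version (or None); publish_state and the state labels are hypothetical and not part of the API.

```python
from typing import Optional


def publish_state(published_rows: dict[str, Optional[int]], entity: str) -> str:
    """Classify an entity from a {entity: version or None} mapping."""
    if entity not in published_rows:
        return "never published"  # may still exist only as a Draft
    if published_rows[entity] is None:
        return "unpublished"      # row kept, version cleared
    return "published"


rows = {"C1": 4, "C2": None}
states = {name: publish_state(rows, name) for name in ("C1", "C2", "C3")}
```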

TODO: Do we need to create a (redundant) title field in this model so that we can more efficiently search across titles within a LearningPackage? Probably not an immediate concern because the number of rows currently shouldn’t be > 10,000 in the more extreme cases.

TODO: Do we need to make a “most_recently” published version when an entry is unpublished/deleted?

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: ObjectDoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

class openedx_content.models_api.Section(*args, **kwargs)#

Bases: Container

A Section is a type of Container that holds Subsections.

Via Container and its PublishableEntityMixin, Sections are also publishable entities and can be added to other containers.

Parameters:

container_code (MultiCollationCharField) – Container code

Relationship fields:

Parameters:

Reverse relationships:

Parameters:

versions (Reverse ForeignKey from ContainerVersion) – All versions of this container (related name of container)

exception DoesNotExist#

Bases: DoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

classmethod validate_entity(entity: PublishableEntity) None#

Check if the given entity is allowed as a child of a Section

class openedx_content.models_api.SectionVersion(*args, **kwargs)#

Bases: ContainerVersion

A SectionVersion is a specific version of a Section.

Via ContainerVersion and its EntityList, it defines the list of Subsections in this version of the Section.

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: DoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

property section: Section#

Convenience accessor to the Section this version is associated with

class openedx_content.models_api.Subsection(*args, **kwargs)#

Bases: Container

A Subsection is a type of Container that holds Units.

Via Container and its PublishableEntityMixin, Subsections are also publishable entities and can be added to other containers.

Parameters:

container_code (MultiCollationCharField) – Container code

Relationship fields:

Parameters:

Reverse relationships:

Parameters:

versions (Reverse ForeignKey from ContainerVersion) – All versions of this container (related name of container)

exception DoesNotExist#

Bases: DoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

classmethod validate_entity(entity: PublishableEntity) None#

Check if the given entity is allowed as a child of a Subsection

class openedx_content.models_api.SubsectionVersion(*args, **kwargs)#

Bases: ContainerVersion

A SubsectionVersion is a specific version of a Subsection.

Via ContainerVersion and its EntityList, it defines the list of Units in this version of the Subsection.

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: DoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

property subsection: Subsection#

Convenience accessor to the Subsection this version is associated with

class openedx_content.models_api.Unit(*args, **kwargs)#

Bases: Container

A Unit is a type of Container that holds Components.

Via Container and its PublishableEntityMixin, Units are also publishable entities and can be added to other containers.

Parameters:

container_code (MultiCollationCharField) – Container code

Relationship fields:

Parameters:

Reverse relationships:

Parameters:

versions (Reverse ForeignKey from ContainerVersion) – All versions of this container (related name of container)

exception DoesNotExist#

Bases: DoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

classmethod validate_entity(entity: PublishableEntity) None#

Check if the given entity is allowed as a child of a Unit

class openedx_content.models_api.UnitVersion(*args, **kwargs)#

Bases: ContainerVersion

A UnitVersion is a specific version of a Unit.

Via ContainerVersion and its EntityList, it defines the list of Components in this version of the Unit.

Relationship fields:

Parameters:
exception DoesNotExist#

Bases: DoesNotExist

exception MultipleObjectsReturned#

Bases: MultipleObjectsReturned

property unit: Unit#

Convenience accessor to the Unit this version is associated with