About ResourceReference

Interface org.apache.nifi.components.resource.ResourceReference; Part of the Java Framework

The PythonPropertyValue asResource() method returns an instance of a proxy object implementing the ResourceReference interface. Different resource types will be represented by various implementations of this interface, each with its own accessible methods, but all sharing the following set of attributes.

Attributes

Name | Type | Description
isAccessible() | Callable | Returns Boolean; indicates whether or not the resource is accessible
getLocation() | Callable | Returns String; for a File or a Directory, this will be the full path name; for a URL, it will be the String form of the URL
getResourceType() | Callable | Returns ResourceType; the type of resource that is being referenced. See getResourceType

Omitted methods

The following methods were omitted, as their use is not advised:

  • asFile: Returns an instance of the java.io.File class
  • asURL: Returns an instance of the java.net.URL class
  • read: Returns an instance of the java.io.InputStream class

See limitations.

e.g.:

from nifiapi.flowfilesource import (
    FlowFileSource,
    FlowFileSourceResult,
)
from nifiapi.properties import (
    ProcessContext,
    PropertyDescriptor,
    ResourceDefinition
)
from json import dumps


class Processor(FlowFileSource):

    (...)

    PROPERTY = PropertyDescriptor(
        name="Property",
        description='''
        This property accepts a file resource reference.
        ''',
        required=True,
        resource_definition=ResourceDefinition(allow_file=True)
    )

    (...)

    def get_resource_summary(
        self, context: ProcessContext, descriptor: PropertyDescriptor
    ) -> dict:
        '''
        Get a summary of a resource for a property supporting
        a resource reference.

        Parameters:
            context (ProcessContext)
            descriptor (PropertyDescriptor)

        Returns:
            dict
        '''
        # asResource() returns a proxy implementing ResourceReference
        resource = context.getProperty(descriptor).asResource()
        summary = {
            # toString() converts the Java enum into a plain string
            "type": resource.getResourceType().toString(),
            "location": resource.getLocation(),
            "is_accessible": resource.isAccessible()
        }
        self.logger.debug(dumps(summary, indent=4))

        return summary

Consider the following example:
Input string: /tmp/json-movie-list/movies.json
Logger output:

{
    "type": "file",
    "location": "/tmp/json-movie-list/movies.json",
    "is_accessible": true
}

getResourceType

The getResourceType() method returns an instance of the ResourceType Java Enum object, which does not have a direct counterpart in the Python framework. Working with this object directly may prove difficult, but this class offers a toString method that returns a string representation of the Enum key.

Possible values returned by the toString method:

  • file
  • directory
  • text
  • URL

e.g.:

# Access resource type definition
resource = context.getProperty(descriptor).asResource()
resource_type = resource.getResourceType().toString()
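
Because the Java enum has no Python counterpart, one option is to mirror the four strings listed above in a small Python Enum and convert the toString() result once, so the rest of the processor can compare against named members instead of raw strings. The following is a minimal sketch under that assumption; ResourceKind and parse_resource_kind are hypothetical names that are not part of the NiFi API, and a plain string stands in for the value returned by toString().

```python
from enum import Enum


class ResourceKind(Enum):
    """Hypothetical Python-side mirror of the toString() values listed above."""
    FILE = "file"
    DIRECTORY = "directory"
    TEXT = "text"
    URL = "URL"


def parse_resource_kind(type_name: str) -> ResourceKind:
    """Convert the string from getResourceType().toString() to a ResourceKind.

    The match is case-sensitive, so "URL" is accepted but "url" is rejected.
    """
    try:
        return ResourceKind(type_name)
    except ValueError:
        raise ValueError(f"Unexpected resource type: {type_name!r}") from None


# A plain string stands in for resource.getResourceType().toString() here.
kind = parse_resource_kind("file")
if kind is ResourceKind.FILE:
    print("Resource is a file")
```

Inside a processor, the argument would be resource.getResourceType().toString() rather than a literal.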

Limitations

In Java, ResourceReference provides a convenient interface for interacting with various resource types, offering functionality such as reading file contents, listing directories, and making HTTP requests. However, calling these methods through the Python framework can cause substantial delays, because each piece of information must be serialized and deserialized. Their use is therefore not advised.

Warning: In the example below, processing a 1.6 MB file took 63 seconds using Java components, compared to 0.01 seconds when reading the file directly in Python.

e.g.:

from nifiapi.flowfilesource import (
    FlowFileSource,
    FlowFileSourceResult,
)
from nifiapi.properties import (
    ProcessContext,
    PropertyDescriptor,
    ResourceDefinition
)
from json import load, dumps
import time


class Processor(FlowFileSource):

    PROPERTY = PropertyDescriptor(
        name="Property",
        description='''
        This property accepts a file resource reference.
        ''',
        required=True,
        resource_definition=ResourceDefinition(allow_file=True)
    )

    (...)

    def benchmark(
        self, context: ProcessContext, descriptor: PropertyDescriptor
    ) -> FlowFileSourceResult:
        '''
        Compare the performance of Java FileInputStream and json.load.

        Parameters:
            context (ProcessContext)
            descriptor (PropertyDescriptor)

        Returns:
            FlowFileSourceResult
        '''

        resource = context.getProperty(descriptor).asResource()
        contents = []

        # Benchmark FileInputStream: every available() and read() call
        # is a separate round trip between Python and Java
        start = time.time()
        stream = resource.read()
        while stream.available() > 0:
            contents.append(stream.read())
        end = time.time()
        # Release the Java-side stream once timing is done
        stream.close()

        self.logger.debug(f"Java stream processing time: {end - start}")

        # Benchmark json.load: the file is read entirely on the Python side
        start = time.time()
        with open(resource.getLocation(), 'rb') as file:
            load(file)
        end = time.time()

        self.logger.debug(f"JSON load processing time: {end - start}")

        return FlowFileSourceResult("success")
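
The faster path from the benchmark can be factored into a small helper that takes only the two strings obtained from getResourceType().toString() and getLocation(), then performs all I/O on the Python side. This is a minimal sketch, not part of the NiFi API: load_json_resource is a hypothetical name, and the sketch assumes the property is restricted to file resources holding JSON, as in the examples above.

```python
import json
from pathlib import Path
from typing import Any


def load_json_resource(resource_type: str, location: str) -> Any:
    """Parse a JSON file resource using only Python-side I/O.

    resource_type: string from resource.getResourceType().toString()
    location:      string from resource.getLocation()
    """
    # Only file resources are handled in this sketch
    if resource_type != "file":
        raise ValueError(f"Expected a file resource, got {resource_type!r}")

    path = Path(location)
    if not path.is_file():
        raise FileNotFoundError(f"Resource is not an existing file: {location}")

    # Read and parse in Python; no Java stream is involved
    with path.open("rb") as handle:
        return json.load(handle)
```

Within a processor, a call might look like load_json_resource(resource.getResourceType().toString(), resource.getLocation()), so that only two short strings are fetched from the Java proxy before the file is read.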