About ResourceReference
Interface org.apache.nifi.components.resource.ResourceReference; part of the Java framework
The PythonPropertyValue asResource() method returns an instance of a proxy object implementing the ResourceReference interface. Different resource types are represented by different implementations of this interface, each with its own accessible methods, but all sharing the following set of attributes.
Attributes
Name | Type | Description |
---|---|---|
isAccessible() | Callable | Returns Boolean; Indicates whether or not the resource is accessible |
getLocation() | Callable | Returns String; For a File or a Directory, this will be the full path name; for a URL it will be the String form of the URL |
getResourceType() | Callable | Returns ResourceType; the type of resource that is being referenced; see getResourceType |
Omitted methods
The following methods were omitted, as their use is not advised:
asFile: Returns an instance of the java.io.File class
asURL: Returns an instance of the java.net.URL class
read: Returns an instance of the java.io.InputStream class
See limitations.
e.g.:
from nifiapi.flowfilesource import (
    FlowFileSource,
    FlowFileSourceResult,
)
from nifiapi.properties import (
    ProcessContext,
    PropertyDescriptor,
    ResourceDefinition
)
from json import dumps

class Processor(FlowFileSource):
    (...)
    PROPERTY = PropertyDescriptor(
        name="Property",
        description='''
            This property accepts a file resource reference.
        ''',
        required=True,
        resource_definition=ResourceDefinition(allow_file=True)
    )
    (...)
    def get_resource_summary(
        self, context: ProcessContext, descriptor: PropertyDescriptor
    ) -> dict:
        '''
        Get a summary of a resource for a property supporting
        a resource reference.
        Parameters:
            context (ProcessContext)
            descriptor (PropertyDescriptor)
        Returns:
            dict
        '''
        resource = context.getProperty(descriptor).asResource()
        # Collect the three attributes shared by all resource types
        summary = {
            "type": resource.getResourceType().toString(),
            "location": resource.getLocation(),
            "is_accessible": resource.isAccessible()
        }
        self.logger.debug(dumps(summary, indent=4))
        return summary
Consider the following example:
Input string: /tmp/json-movie-list/movies.json
Logger output:
{
    "type": "file",
    "location": "/tmp/json-movie-list/movies.json",
    "is_accessible": true
}
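A processor may want to stop early when the referenced resource cannot be reached. The following is a minimal sketch of such a guard, using only isAccessible() and getLocation() from the table above. StubResource and require_accessible are hypothetical names introduced for illustration; the stub only stands in for the proxy returned by asResource() so the snippet can run outside NiFi.

```python
from dataclasses import dataclass


@dataclass
class StubResource:
    """Hypothetical stand-in for the ResourceReference proxy (local testing only)."""
    location: str
    accessible: bool

    def getLocation(self) -> str:
        return self.location

    def isAccessible(self) -> bool:
        return self.accessible


def require_accessible(resource) -> str:
    """Return the resource location, or raise if the resource is not accessible."""
    if not resource.isAccessible():
        raise ValueError(f"Resource is not accessible: {resource.getLocation()}")
    return resource.getLocation()
```

Inside a processor, the same function could presumably be called with the object returned by context.getProperty(descriptor).asResource().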
getResourceType
The getResourceType() method returns an instance of the ResourceType Java Enum object, which does not have a direct counterpart in the Python framework. Working with this object directly may prove difficult, but this class offers a toString method that returns a string representation of the Enum key.
Possible values returned by the toString method:
file
directory
text
URL
e.g.:
# Access resource type definition
resource = context.getProperty(descriptor).asResource()
resource_type = resource.getResourceType().toString()
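Once the type is available as a plain string, ordinary Python comparisons can be used to branch on it. The sketch below assumes the four values listed above, including the upper-case URL. describe_resource is a hypothetical helper and is not part of the NiFi API.

```python
def describe_resource(resource_type: str, location: str) -> str:
    """Map the string from getResourceType().toString() to a short description."""
    if resource_type == "file":
        return f"file at {location}"
    if resource_type == "directory":
        return f"directory at {location}"
    if resource_type == "URL":
        return f"remote resource at {location}"
    if resource_type == "text":
        return "literal text value"
    # Fail loudly on anything outside the documented values
    raise ValueError(f"Unexpected resource type: {resource_type!r}")
```

Raising on an unknown value, rather than silently ignoring it, makes a mismatch between the property definition and the handling code easier to spot.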
Limitations
In Java, ResourceReference provides a convenient interface for interacting with various resource types, offering functionality such as reading file contents, listing directories, and making HTTP requests. However, calling these methods through the Python framework can cause substantial delays, because each piece of information must be serialized and deserialized, and is therefore not advised.
In the example below, processing a 1.6 MB file took 63 seconds using Java components, compared to 0.01 seconds when reading the file directly in Python.
e.g.:
from nifiapi.flowfilesource import (
    FlowFileSource,
    FlowFileSourceResult,
)
from nifiapi.properties import (
    ProcessContext,
    PropertyDescriptor,
    ResourceDefinition
)
from json import load, dumps
import time

class Processor(FlowFileSource):
    PROPERTY = PropertyDescriptor(
        name="Property",
        description='''
            This property accepts a file resource reference.
        ''',
        required=True,
        resource_definition=ResourceDefinition(allow_file=True)
    )
    (...)
    def benchmark(
        self, context: ProcessContext, descriptor: PropertyDescriptor
    ) -> FlowFileSourceResult:
        '''
        Compare the performance of Java FileInputStream and json.load.
        Parameters:
            context (ProcessContext)
            descriptor (PropertyDescriptor)
        Returns:
            FlowFileSourceResult
        '''
        resource = context.getProperty(descriptor).asResource()
        contents = []
        # Benchmark FileInputStream: each read() call crosses the Java/Python bridge
        start = time.time()
        stream = resource.read()
        while stream.available() > 0:
            contents.append(stream.read())
        end = time.time()
        stream.close()
        self.logger.debug(f"Java stream processing time: {end - start}")
        # Benchmark json.load: the file is opened directly in Python
        start = time.time()
        with open(resource.getLocation(), 'rb') as file:
            load(file)
        end = time.time()
        self.logger.debug(f"JSON load processing time: {end - start}")
        return FlowFileSourceResult("success")
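Given the timings reported in this section, one way to avoid the Java stream is to take only the string from getLocation() and hand it to Python's own file handling. The following is a minimal sketch, assuming the resource is a local JSON file. load_json_resource is a hypothetical helper, not part of the NiFi API.

```python
import json
from pathlib import Path


def load_json_resource(location: str):
    """Parse a JSON file with Python I/O, given the path from getLocation()."""
    # Binary mode lets json.load detect the encoding itself
    with Path(location).open("rb") as handle:
        return json.load(handle)
```

In a processor this might be called as load_json_resource(resource.getLocation()), so that only a single string crosses the Java/Python boundary.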