- isub(+Text1:text, +Text2:text, -Similarity:float, +Options:list) is det
- Similarity is a measure of the similarity/dissimilarity between
Text1 and Text2. E.g.

      ?- isub('E56.Language', 'languange', D, [normalize(true)]).
      D = 0.4226950354609929.                       % [-1,1] range

      ?- isub('E56.Language', 'languange', D, [normalize(true),zero_to_one(true)]).
      D = 0.7113475177304964.                       % [0,1] range

      ?- isub('E56.Language', 'languange', D, []).  % without normalization
      D = 0.19047619047619047.                      % [-1,1] range

      ?- isub(aa, aa, D, []).                       % does not work for short substrings
      D = -0.8.

      ?- isub(aa, aa, D, [substring_threshold(0)]). % works with short substrings
      D = 1.0.                                      % but may give unwanted values
                                                    % between e.g. 'store' and 'spore'.

      ?- isub(joe, hoe, D, [substring_threshold(0)]).
      D = 0.5315315315315314.

      ?- isub(joe, hoe, D, []).
      D = -1.0.

This version of isub/4 replaces the old version while remaining backward compatible. It adds several options to tweak the algorithm.
- Arguments:
  - Text1 and Text2 are each an atom, a string, or a list of characters or character codes.
  - Similarity is a float in the range [-1,1], where 1.0 means most similar. The range can be set to [0,1] with the zero_to_one option described below.
  - Options is a list with the elements described below. Please note that the options are processed at compile time using goal_expansion to provide much better speed. Supported options are:
    - normalize(+Boolean)
      Applies string normalization as implemented by the original authors: Text1 and Text2 are mapped to lowercase and the characters "._ " are removed. Lowercase mapping is done with the C-library function towlower(). In general, the required normalization is domain dependent and is better left to the caller. See e.g. unaccent_atom/2. The default is to skip normalization (false).
    - zero_to_one(+Boolean)
      The old isub implementation deviated from the original algorithm by returning a value in the [0,1] range. This new isub/4 implementation defaults to the original range of [-1,1], but this option can be set to true to change the output range to [0,1].
    - substring_threshold(+Nonneg)
      The original algorithm was meant to compare terms in semantic web ontologies, and it had a hard-coded parameter that only considered substring similarities greater than 2 characters. As a result, comparing for example 'aa' with 'aa' returns -0.8, which is not expected. This option allows the user to set any threshold, such as 0, so that the similarity between short substrings can be properly recognized. The default value is 2, which is what the original algorithm used.
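
As an illustration of how the options combine, the following is a minimal sketch of picking the closest candidate for a given text. The predicate best_match/4 is a hypothetical helper and not part of library(isub); it assumes that normalization and the [0,1] range are wanted. Because the options are described above as being processed at compile time, the sketch writes the option list literally in the call instead of passing it in as a variable.

```prolog
:- use_module(library(isub)).
:- use_module(library(lists)).

%!  best_match(+Text, +Candidates, -Best, -Score) is semidet.
%
%   Hypothetical helper (not part of library(isub)): Best is the
%   element of Candidates that is most similar to Text according
%   to isub/4, and Score is its similarity in the [0,1] range.
%   Fails if Candidates is empty.

best_match(Text, Candidates, Best, Score) :-
    findall(S-C,
            (   member(C, Candidates),
                % Literal option list: normalize both texts and
                % map the result from [-1,1] to [0,1].
                isub(Text, C, S, [normalize(true), zero_to_one(true)])
            ),
            Scored),
    % Pairs are compared in the standard order of terms, so the
    % pair with the highest score is the maximum.
    max_member(Score-Best, Scored).
```

It could be called as `?- best_match('E56.Language', [languange, store], Best, Score).` If short candidates such as 'aa' must be matched, substring_threshold(0) can be added to the option list, at the cost of the possibly unwanted values noted for that option.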